git checkout --ours does not remove files from unmerged files list

Question

git checkout --ours does not remove files from unmerged files list

Cha*_*one 24 git repository git-merge git-checkout git-branch

Hi, I need to merge two branches like this.

This is just an example of what is happening; I work with hundreds of files that need resolving.

git merge branch1
...conflicts...
git status
....
# Unmerged paths:
#   (use "git add/rm <file>..." as appropriate to mark resolution)
#
#   both added:   file1
#   both added:   file2
#   both added:   file3
#   both added:   file4
git checkout --ours file1
git checkout --theirs file2
git checkout --ours file3
git checkout --theirs file4
git commit -a -m "this should work"
U   file1
fatal: 'commit' is not possible because you have unmerged files.
Please, fix them up in the work tree, and then use 'git add/rm <file>' as
appropriate to mark resolution and make a commit, or use 'git commit -a'.

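When I run git mergetool, the content is correct, coming only from the "ours" branch, and when I save it the file disappears from the unmerged list. But since I have hundreds of files like this, that is not an option.

I thought this approach would get me where I want to be: easily saying which branch's file I want to keep.

But I guess I misunderstood the concept of the git checkout --ours/--theirs commands after a merge.

Could you please provide some information on how to handle this situation? I use git 1.7.1.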

Answer 1

tor*_*rek 87

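It's mostly a quirk of how git checkout works internally. The Git folks have a tendency to let implementation dictate interface.

The end result is that after git checkout with --ours or --theirs, if you want to resolve the conflict, you must also git add the same paths: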

git checkout --ours -- path/to/file
git add path/to/file

But this is not the case with other forms of git checkout:

git checkout HEAD -- path/to/file

or:

git checkout MERGE_HEAD -- path/to/file

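(These are subtly different in multiple ways.) In some cases this means the fastest way is to use the middle command, git checkout HEAD -- path/to/file. (Incidentally, the -- here is to make sure Git can distinguish between a path name and an option or branch name. For instance, if you have a file named --theirs, it will look like an option, but -- tells Git that no, it really is a path name.)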
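As a minimal sketch of the checkout-then-add sequence applied to the question's situation, the script below builds a throwaway repository with an add/add conflict and resolves one file from each side. The repository location, identity settings, and file contents are assumptions made up for the demonstration; only the git checkout and git add steps come from the text above.

```shell
#!/bin/sh
# Throwaway demo repository; every name and content here is illustrative.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo User"
main_branch=$(git symbolic-ref --short HEAD)

# Build an add/add conflict on two files, as in the question.
git commit -q --allow-empty -m "base"
git checkout -q -b branch1
echo "theirs 1" > file1
echo "theirs 2" > file2
git add file1 file2
git commit -q -m "branch1 adds files"
git checkout -q "$main_branch"
echo "ours 1" > file1
echo "ours 2" > file2
git add file1 file2
git commit -q -m "main line adds files"
git merge branch1 >/dev/null 2>&1   # stops with conflicts

# Picking a side only rewrites the work-tree copies...
git checkout --ours -- file1
git checkout --theirs -- file2
unmerged_before=$(git diff --name-only --diff-filter=U | wc -l)

# ...and git add on the same paths is what marks them resolved.
git add file1 file2
unmerged_after=$(git diff --name-only --diff-filter=U | wc -l)

git commit -q -m "merge branch1"
echo "unmerged before add: $unmerged_before, after add: $unmerged_after"
```

Both git checkout and git add accept several paths in one invocation, so with hundreds of files the same two steps can be run once per side with a list of paths.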

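To see how this all works internally, and why you need the separate git add except when you don't, read on. :-) First, let's do a quick review of the merge process.

Merge, part 1: how merge begins

When you run: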

$ git merge commit-or-branch

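The first thing Git does is find the merge base between the named commit and the current (HEAD) commit. (Note that if you supply a branch name here, as in git merge otherbranch, Git translates that to a commit ID, namely the tip of the branch. It saves the branch name argument for the eventual merge log message, but needs the commit ID to find the merge base.)

Having found a suitable merge base,¹ Git then produces two git diff listings: one from the merge base to HEAD, and one from the merge base to the commit you identified. This gets "what you changed" and "what they changed", which Git now has to combine.

For files where you made a change and they did not, Git can just take your version.

For files where they made a change and you did not, Git can just take their version.

For files where you both made changes, Git has to do some real merge work. It compares the changes line by line, to see if it can combine them. If it can combine them, it does so. If the changes seem to conflict (based, again, on purely line-by-line comparisons), Git declares a "merge conflict" for that file (and goes ahead and tries to merge anyway, but leaves conflict markers in place).

Once Git has merged everything it can, it either finishes the merge, because there were no conflicts, or stops with a merge conflict.

¹The merge base is obvious if you draw the commit graph. Without drawing the graph, it's kind of mysterious. This is why I always tell people to draw the graph, or at least, as much of it as needed to make sense.

The technical definition is that the merge base is the "lowest common ancestor" (LCA) node in the commit graph. In less technical terms, it's the most recent commit where your current branch joins up with the branch you're merging. That is, by recording each merge's parent commit IDs, Git is able to find the last time the two branches were together, and hence figure out what you did, and what they did. For this to work, though, Git has to record each merge. Specifically, it must write both (or all, for so-called "octopus" merges) parent IDs into the new merge commit.

In some cases, there is more than one suitable merge base. The process then depends on your merge strategy. The default recursive strategy will merge the multiple merge bases to produce a "virtual merge base". This is rare enough that you can ignore it for now.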
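To make the merge base and the two diff listings concrete, here is a small sketch that asks Git for them directly. It assumes a throwaway repository; the file names (common.txt, ours.txt, theirs.txt) and the identity settings are invented for the demonstration, while otherbranch is the name used in the text.

```shell
#!/bin/sh
# Throwaway demo repository; file names and contents are illustrative.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo User"
main_branch=$(git symbolic-ref --short HEAD)

# One shared commit, then one commit on each side.
echo "shared" > common.txt
git add common.txt
git commit -q -m "base"
base_id=$(git rev-parse HEAD)

git checkout -q -b otherbranch
echo "theirs" > theirs.txt
git add theirs.txt
git commit -q -m "their change"

git checkout -q "$main_branch"
echo "ours" > ours.txt
git add ours.txt
git commit -q -m "our change"

# The merge base is the commit where the two lines of history last met.
merge_base=$(git merge-base HEAD otherbranch)

# "What you changed" and "what they changed", both relative to that base.
our_changes=$(git diff --name-only "$merge_base" HEAD)
their_changes=$(git diff --name-only "$merge_base" otherbranch)

echo "merge base:    $merge_base"
echo "we changed:    $our_changes"
echo "they changed:  $their_changes"
```

Here each side touched a different file, so a merge would simply take ours.txt from one side and theirs.txt from the other without any conflict.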

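Merge, part 2: stopping with a conflict, and Git's "index"

When Git does stop this way, it needs to give you a chance to resolve the conflicts. But this also means that it needs to record the conflicts, and this is where Git's "index" (also called "the staging area", and sometimes "the cache") really earns its existence.

For each staged file in the work-tree, the index has up to four entries, rather than just one entry. At most three of these are actually in use, but there are four slots, which are numbered 0 through 3.

Slot zero is used for resolved files. When you're working with Git and not doing merges, only slot zero is used. When you edit a file in the work-tree, it has "unstaged changes", and then you git add the file and the changes are written to the repository, updating slot zero; your changes are now "staged".

Slots 1-3 are used for unresolved files. When git merge has to stop with a merge conflict, it leaves slot zero empty, and writes everything to slots 1, 2, and 3. The merge base version of the file is recorded in slot 1, the --ours version is recorded in slot 2, and the --theirs version is recorded in slot 3. These nonzero slot entries are how Git knows that the file is unresolved.²

As you resolve files, you git add them, which erases all the slot 1-3 entries and writes a slot-zero, staged-for-commit entry. That's how Git knows the file is resolved and ready for a new commit. (Or, in some cases you git rm the file, in which case Git writes a special "removed" value to slot zero, again erasing slots 1-3.)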

²There are a few cases where one of these three slots is also empty. Suppose file new does not exist in the merge base and is added in both ours and theirs. Then :1:new is left empty and :2:new and :3:new record the add/add conflict. Or, suppose file f does exist in the base, is modified in our HEAD branch, and is removed in their branch. Then :1:f records the base file, :2:f records our version of the file, and :3:f is empty, recording the modify/delete conflict.

For modify/modify conflicts, all three slots are occupied; only when one file is missing is one of these slots empty. It's logically impossible to have two empty slots: there's no such thing as a delete/delete conflict, nor a nocreate/add conflict. But there is some weirdness with rename conflicts, which I've omitted here as this answer is long enough! In any case, it's the very existence of some value(s) in slots 1, 2, and/or 3 that mark the file as unresolved.
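The slots can be inspected with git ls-files -s, which prints the slot (Git calls it the "stage") as the third field of each line. The sketch below sets up a modify/modify conflict in a throwaway repository and records the occupied slots at three points; the file name f, the branch name theirs-branch, and the contents are assumptions chosen for the demonstration.

```shell
#!/bin/sh
# Throwaway demo repository; names and contents are illustrative.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo User"
main_branch=$(git symbolic-ref --short HEAD)

# Modify/modify conflict: both sides change the same line of f.
echo "base" > f
git add f
git commit -q -m "base"
git checkout -q -b theirs-branch
echo "theirs" > f
git commit -q -am "their change"
git checkout -q "$main_branch"
echo "ours" > f
git commit -q -am "our change"
git merge theirs-branch >/dev/null 2>&1   # stops with a conflict

# Helper: list the occupied slots for f, e.g. "123" or "0".
# Each ls-files -s line is: mode, blob hash, stage, then the path.
slots_for_f() {
    git ls-files -s -- f | awk '{print $3}' | tr -d '\n'
}

slots_conflicted=$(slots_for_f)        # base, ours, theirs

git checkout --ours -- f               # work-tree only
slots_after_checkout=$(slots_for_f)

git add f                              # collapses to slot zero
slots_after_add=$(slots_for_f)

echo "conflicted: $slots_conflicted"
echo "after checkout --ours: $slots_after_checkout"
echo "after git add: $slots_after_add"
```

For an add/add conflict like the one in the question, slot 1 would be missing from the first listing, matching the footnote above.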

Merge, part 3: finishing the merge

Once all files are resolved—all entries are only in the zero-numbered slots—you can git commit the merge result. If git merge is able to do the merge without assistance, it normally runs git commit for you, but the actual commit is still done by running git commit.

The commit command works the same way as it always does: it turns the index contents into tree objects and writes a new commit. The only thing special about a merge commit is that it has more than one parent commit ID.³ The extra parents come from a file git merge leaves behind. The default merge message also comes from a file (a separate file in practice, although in principle they could have been combined).

Note that in all cases, the new commit's contents are determined by the index's contents. Moreover, once the new commit is done, the index is still full: it still contains the same contents. By default, git commit won't make another new commit at this point because it sees that the index matches the HEAD commit. It calls this "empty" and requires --allow-empty to make an extra commit, but the index is not empty at all. It's still quite full—it just is full of the same thing as the HEAD commit.
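A quick way to see that the index stays full after a commit is to count its entries and then try to commit again. This is only a sketch in a throwaway repository; the file names a.txt and b.txt and the commit messages are made up for the demonstration.

```shell
#!/bin/sh
# Throwaway demo repository; names and contents are illustrative.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo User"

echo "one" > a.txt
echo "two" > b.txt
git add a.txt b.txt
git commit -q -m "first"

# The commit did not drain the index: both entries are still there.
entries_after_commit=$(git ls-files -s | wc -l)

# The index matches HEAD, so a plain commit is refused as "empty"...
if git commit -q -m "nothing new" >/dev/null 2>&1; then
    second_commit=made
else
    second_commit=refused
fi

# ...unless we explicitly ask for it.
git commit -q --allow-empty -m "explicitly empty"
commit_count=$(git rev-list --count HEAD)

echo "index entries after commit: $entries_after_commit"
echo "plain second commit: $second_commit"
echo "commits on this branch: $commit_count"
```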

³This assumes you are making a real merge, not a squash merge. When making a squash merge, git merge deliberately does not write the extra parent ID to the extra file, so that the new merge commit has only a single parent. (For some reason, git merge --squash also suppresses the automatic commit, as if it included the --no-commit flag as well. It's not clear why, since you could just run git merge --squash --no-commit if you want the automatic commit suppressed.)

A squash merge does not record its other parent(s). This means that if we go to merge again, some time later, Git won't know where to start the diffs from. This means you should generally only squash-merge if you plan to abandon the other branch. (There are some tricky ways to combine squash merges and real merges but they're well out of the scope of this answer.)

How `git checkout branch` uses the index

With all that out of the way, we then have to look at how git checkout uses Git's index, too. Remember, in normal usage, only slot zero is occupied, and the index has one entry for every staged file. Moreover, that entry matches the current (HEAD) commit unless you've modified the file and git add-ed the result. It also matches the file in the work-tree unless you've modified the file.⁴

If you are on some branch and you git checkout some other branch, Git tries to switch to the other branch. For this to succeed, Git has to replace the index entry for each file with the entry that goes with the other branch.

Let's say, just for concreteness, that you're on master and you are doing git checkout branch. Git will compare each current index entry with the index entry it would need to be on the tip-most commit of branch branch. That is, for file README.txt, are the master contents the same as those for branch, or are they different?

If the contents are the same, Git can take it easy and just move on to the next file. If the contents are different, Git has to do something to the index entry. (It's around this point that Git checks to see if the work-tree file differs from the index entry, too.)

Specifically, in the case where branch's file differs from master's, git checkout has to replace the index entry with the version from branch—or, if README.txt doesn't exist in the tip commit of branch, Git has to remove the index entry. Moreover, if git checkout is going to modify or remove the index entry, it also needs to modify or remove the work-tree file. Git makes sure this is a safe thing to do, i.e., that the work-tree file matches the master commit's file, before it will let you switch branches.

In other words, this is how (and why) Git finds out whether it's OK to change branches—whether you have modifications that would be clobbered by switching from master to branch. If you have modifications in your work-tree, but the modified files are the same in both branches, Git can just leave the modifications in the index and work-tree. It can and will alert you to these modified files "carried over" into the new branch: easy, since it had to check for this anyway.

Once all the tests have passed and Git has decided that it's OK to switch from master to branch—or if you specified --force—git checkout actually updates the index with all the changed (or removed) files, and updates the work-tree to match.

Note that all this action has used slot zero. There are no slot 1-3 entries at all, so that git checkout does not have to remove any such things. You're not in the middle of a conflicted merge, and you ran git checkout branch to not just check out one file, but rather an entire set of files and switch branches.
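The safety check can be observed from the outside with a small experiment. The sketch below assumes a throwaway repository in which README.txt differs between the two branch tips and notes.txt (a name invented here) is identical in both; the branch called branch follows the text, and the contents are illustrative.

```shell
#!/bin/sh
# Throwaway demo repository; names and contents are illustrative.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo User"
main_branch=$(git symbolic-ref --short HEAD)

echo "readme v1" > README.txt
echo "notes" > notes.txt
git add README.txt notes.txt
git commit -q -m "base"

git checkout -q -b branch
echo "readme v2" > README.txt
git commit -q -am "branch changes README"
git checkout -q "$main_branch"

# notes.txt is the same in both tips, so an edit to it is carried over.
echo "local edit" >> notes.txt
if git checkout -q branch 2>/dev/null; then
    carried=allowed
else
    carried=refused
fi
git checkout -q "$main_branch"

# README.txt differs between the tips, so an edit to it blocks the switch.
echo "local edit" >> README.txt
if git checkout -q branch 2>/dev/null; then
    clobber=allowed
else
    clobber=refused
fi
still_on=$(git symbolic-ref --short HEAD)

echo "switch with edit to unchanged file: $carried"
echo "switch with edit to differing file: $clobber"
```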

Note also that you can, instead of checking out a branch, check out a specific commit. For instance, this is how you might look at a previous commit:

$ git log
... peruse log output ...
$ git checkout f17c393 # let's see what's in this commit

The action here is the same as for checking out a branch, except that instead of using the tip commit of the branch, Git checks out an arbitrary commit. Instead of now being "on" the new branch, you're now on no branch:⁵ Git gives you a "detached HEAD". To reattach your head, you must git checkout master or git checkout branch to get back "on" the branch.

⁴The index entry may not match the work-tree version if Git is doing special CR-LF ending modifications, or applying smudge filters. This gets pretty advanced and the best thing is to ignore this case for now. :-)

⁵More accurately, this puts you on an anonymous (unnamed) branch that will grow from the current commit. You will stay in detached HEAD mode if you make new commits, and as soon as you git checkout some other commit or branch, you'll switch there and Git will "abandon" the commits you've made. The point of this detached HEAD mode is both to let you look around and to let you make new commits that will just go away if you don't take special action to save them. For anyone relatively new to Git, though, having commits "just go away" is not so good—so make sure you know that you're in this "detached HEAD" mode, whenever you are in it.

The git status command will tell you if you're in detached HEAD mode. Use it often.⁶ If your Git is old (the OP's is 1.7.1, which is very old now), git status is not as helpful as it is in modern versions of Git, but it's still better than nothing.

⁶Some programmers like to have key git status information encoded into each command-prompt. I personally do not go this far, but it can be a good idea.
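For scripts or prompts, one way to detect detached HEAD mode without parsing git status is git symbolic-ref -q HEAD, which fails quietly when HEAD is not attached to a branch. The sketch below assumes a throwaway repository; the helper name head_state is hypothetical, not a Git command.

```shell
#!/bin/sh
# Throwaway demo repository; names are illustrative.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo User"
main_branch=$(git symbolic-ref --short HEAD)

git commit -q --allow-empty -m "first"
first_id=$(git rev-parse HEAD)
git commit -q --allow-empty -m "second"

# Hypothetical helper: report whether HEAD currently names a branch.
head_state() {
    if git symbolic-ref -q HEAD >/dev/null; then
        echo attached
    else
        echo detached
    fi
}

state_on_branch=$(head_state)

git checkout -q "$first_id"        # look at an older commit
state_at_commit=$(head_state)

git checkout -q "$main_branch"     # reattach HEAD
state_reattached=$(head_state)

echo "on branch: $state_on_branch"
echo "at commit: $state_at_commit"
echo "back on branch: $state_reattached"
```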

Checking out specific files, and why it sometimes resolves merge conflicts

The git checkout command has other modes of operation, though. In particular, you can run git checkout [flags etc] -- path [path ...] to check out specific files. This is where things get weird. Note that when you use this form of the command, Git does not check to make sure you are not overwriting your files.⁷

Now, instead of changing branches, you're telling Git to get some particular file(s) from somewhere, and drop them into the work-tree, overwriting whatever is there, if anything. The tricky question is: just where is Git getting these files?

Generally speaking, there are three places that Git keeps files:

in commits;⁸
in the index;
and in the work-tree.

The checkout command can read from either of the first two places, and always writes the result to the work-tree.

When git checkout gets a file from a commit, it first copies it to the index. Whenever it does this, it writes the file to slot zero. Writing to slot zero wipes out slots 1-3, if they are occupied. When git checkout gets a file from the index, it does not have to copy it to the index. (Of course not: it's already there!) This is how git checkout works when you are not in the middle of a merge: you can git checkout -- path/to/file to get the index version back.⁹

Suppose, though, that you are in the middle of a conflicted merge and are going to git checkout some path, maybe with --ours. (If you are not in the middle of a merge, there's nothing in slots 1-3, and --ours makes no sense.) So you run git checkout --ours -- path/to/file.

This git checkout gets the file from the index—in this case, from index slot 2. Since this is already in the index, Git does not write to the index, just to the work-tree. So the file is not resolved!

The same goes for git checkout --theirs: it gets the file from the index (slot 3), and does not resolve anything.

But: if you git checkout HEAD -- path/to/file, you are telling git checkout to extract from the HEAD commit. Since this is a commit, Git starts by writing the file contents to the index. This writes slot 0 and erases 1-3. And now the file is resolved!

Since, during a conflicted merge, Git records the being-merged commit's ID in MERGE_HEAD, you can also git checkout MERGE_HEAD -- path/to/file to get the file from the other commit. This, too, extracts from a commit, so it writes to the index, resolving the file.
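The difference between extracting from the index and extracting from a commit can be checked by counting unmerged index entries after each command. This is a sketch in a throwaway repository with two conflicted files; the names f, g, and theirs-branch and all contents are assumptions for the demonstration.

```shell
#!/bin/sh
# Throwaway demo repository; names and contents are illustrative.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo User"
main_branch=$(git symbolic-ref --short HEAD)

echo "base f" > f
echo "base g" > g
git add f g
git commit -q -m "base"
git checkout -q -b theirs-branch
echo "theirs f" > f
echo "theirs g" > g
git commit -q -am "their changes"
git checkout -q "$main_branch"
echo "ours f" > f
echo "ours g" > g
git commit -q -am "our changes"
git merge theirs-branch >/dev/null 2>&1   # stops with conflicts

# From the index (slot 2): work-tree updated, f still unmerged.
git checkout --ours -- f
f_entries_after_ours=$(git ls-files -u -- f | wc -l)

# From a commit: slot 0 is written, slots 1-3 are erased.
git checkout HEAD -- f
f_entries_after_head=$(git ls-files -u -- f | wc -l)

# Same idea for the other side, via MERGE_HEAD.
git checkout MERGE_HEAD -- g
unmerged_total=$(git ls-files -u | wc -l)

echo "f unmerged entries after --ours: $f_entries_after_ours"
echo "f unmerged entries after HEAD:   $f_entries_after_head"
echo "unmerged entries left overall:   $unmerged_total"
```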

⁷I often wish Git used a different front-end command for this, since we could then say, unequivocally, that git checkout is safe, that it won't overwrite files without --force. But this kind of git checkout does overwrite files, on purpose!

⁸This is a bit of a lie, or at least a stretch: commits don't contain files directly. Instead, commits contain a (single) pointer to a tree object. This tree object contains the IDs of additional tree objects and of blob objects. The blob objects contain the actual file contents.

The same is, in fact, true of the index as well. Each index slot contains, not the actual file contents, but rather the hash IDs of blob objects in the repository.

For our purposes, though, this doesn't really matter: we just ask Git to retrieve commit:path and it finds the trees and the blob ID for us. Or, we ask Git to retrieve :n:path and it finds the blob ID in the index entry for path for slot n. Then it gets us the file's contents, and we're good to go.

This colon-and-number syntax works everywhere in Git, while the --ours and --theirs flags only work in git checkout. The funny colon syntax is described in gitrevisions.
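As a small illustration of the colon syntax, the sketch below reads each slot of a conflicted file with git show. It assumes a throwaway repository; the file name f, the branch name theirs-branch, and the contents are made up for the demonstration.

```shell
#!/bin/sh
# Throwaway demo repository; names and contents are illustrative.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo User"
main_branch=$(git symbolic-ref --short HEAD)

echo "base" > f
git add f
git commit -q -m "base"
git checkout -q -b theirs-branch
echo "theirs" > f
git commit -q -am "their change"
git checkout -q "$main_branch"
echo "ours" > f
git commit -q -am "our change"
git merge theirs-branch >/dev/null 2>&1   # stops with a conflict

# :n:path reads slot n of the index; commit:path reads from a commit.
base_text=$(git show :1:f)
ours_text=$(git show :2:f)
theirs_text=$(git show :3:f)
head_text=$(git show HEAD:f)
merge_head_text=$(git show MERGE_HEAD:f)

echo "slot 1: $base_text"
echo "slot 2: $ours_text"
echo "slot 3: $theirs_text"
```

Because git show only prints the contents, it changes neither the index nor the work-tree, which makes it a safe way to look at each side before deciding how to resolve.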

⁹The use-case for git checkout -- path is this: suppose, whether or not you are merging, you made some changes to a file, tested, found those changes worked, then ran git add on the file. Then you decided to make more changes, but have not run git add again. You test the second set of changes and find they are wrong. If only you could get the work-tree version of the file set back to the version you git add-ed just a moment ago.... Aha, you can: you git checkout -- path and Git copies the index version, from slot zero, back to the work-tree.
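The use-case in this footnote can be replayed in a few lines. This is a sketch in a throwaway repository; the file name file.txt and its contents are invented for the demonstration.

```shell
#!/bin/sh
# Throwaway demo repository; names and contents are illustrative.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo User"

echo "committed" > file.txt
git add file.txt
git commit -q -m "initial"

# First set of changes: tested, good, and staged.
echo "first change" > file.txt
git add file.txt

# Second set of changes: not staged, and it turns out to be wrong.
echo "second change" > file.txt

# Copy the slot-zero (staged) version back over the work-tree file.
git checkout -- file.txt
restored=$(cat file.txt)

echo "work-tree now contains: $restored"
```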

Subtle behavior warning

Note, though, that using --ours or --theirs has another subtle difference besides the "extract from index and therefore don't resolve" behavior. Suppose that, in our conflicted merge, Git has detected that some file was renamed. That is, in the merge base, we had file doc.txt, but now in HEAD we have Documentation/doc.txt. The path we need for git checkout --ours is Documentation/doc.txt. This is also the path in the HEAD commit, so it's OK to git checkout HEAD -- Documentation/doc.txt.

But what if, in the commit we're merging, doc.txt did not get renamed? In this case, we should¹⁰ be able to git checkout --theirs -- Documentation/doc.txt to get their doc.txt from the index. But if we try to git checkout MERGE_HEAD -- Documentation/doc.txt, Git won't be able to find the file: it's not in Documentation, in the MERGE_HEAD commit. We have to git checkout MERGE_HEAD -- doc.txt to get their file ... and that would not resolve Documentation/doc.txt. In fact, it would just create ./doc.txt (if it was renamed there's almost certainly no ./doc.txt, hence "create" is a better guess than "overwrite").

Because merging uses HEAD's names, it's generally safe enough to git checkout HEAD -- path to extract-and-resolve in one step. And if you're working on resolving files and have been running git status, you should know whether they have a renamed file, and therefore whether it's safe to git checkout MERGE_HEAD -- path to extract-and-resolve in one step by discarding your own changes. But you should still be aware of this, and know what to do if there is a rename to be concerned with.

¹⁰I say "should" here, not "can", because Git currently forgets the rename a little bit too soon. So if using --theirs to get a file that you renamed in HEAD, you have to use the old name here too, and then rename the file in the work-tree.

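This is probably one of the most underrated posts I've ever seen. This should be a wiki! (17 upvotes)
I found this post a week ago and have already come back to it three times. "Underrated" is an understatement! (3 upvotes)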

Archived:	9 years, 3 months ago
Views:	13126
Last recorded:	7 years, 4 months ago
