use*_*376 5 git branching-and-merging
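I sometimes find myself in a simple situation where I am working on some changes and have created a branch. As I move through my changes, I start finding a few things that need cleaning up, or some other partially related thing that I want to get started on. So, wanting to keep the branches specific, I quickly split off another branch and start working on another set of changes that may not have the same dependencies as the previous branch. In the end, I wind up with two branches in which I have tried to isolate the changes; however, this second branch originated from the first branch, which in turn came from 'master'.
I can (and have) updated each branch individually by merging 'master' into each of them, and I would like to get the second branch ready to merge into 'master', since it has fewer dependencies than the first branch I created. However, this branch also contains changes from the first branch, made before the point where it was split off.
So I am wondering: is there a way to tell Git something like "remove all commits that exist in this other branch", so that I am left with my second branch without all the changes made in the first branch, allowing me to merge the second branch into 'master' and then go back to working on the first branch I created?
I may simply not be finding the right terminology to look up how this is done in Git. But maybe it can't be done. It seems like it should be quite doable, given how well Git shows me the appropriate diffs between branches 1 and 2, even after I have updated both individually from 'master'.
And "removing" from the branch isn't strictly necessary: even creating yet another branch that somehow excludes the changes made in the first branch that are also in the second would be sufficient.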
tor*_*rek 13
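Yes, you can do this. In some cases it is even easy, because git rebase simply does it automatically. In some cases it is quite tough. Let's look at these cases.
First, it's crucial to draw the commit graph, as is almost always the case in Git. To get there, let's start with a quick review of Git basics. (This is a good idea because a lot of Git tutorials skip the basics, since the basics are boring and confusing. :-) ) First, let's look at what a commit is and what it does for you.
A commit, in Git, is a completely concrete thing. We can look at any actual commit (most of them are pretty small), not so much with git show, which pretties them up a lot, but with git cat-file -p, which shows the direct raw contents (well, tree objects need a small adjustment, so sometimes "mostly raw") of an actual Git object: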
$ git cat-file -p 3bc53220cb2dcf709f7a027a3f526befd021d858
tree 5654dad720d5b0a8177537390575cd6171c5fc50
parent 3e5c63943d35be1804d302c0393affc4916c3dc3
author Junio C Hamano <gitster@pobox.com> 1488233064 -0800
committer Junio C Hamano <gitster@pobox.com> 1488233064 -0800
First batch after 2.12
Signed-off-by: Junio C Hamano <gitster@pobox.com>
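That is the entire commit. Its name (the one name that identifies that commit, from now until forever) is 3bc5322... (some big ugly hash ID that no human ever wants to deal with, if one can avoid it). It stores several more big ugly hash IDs: one for the tree, and some number (usually just one) for parents. It has an author (name, email address, and time stamp) and a committer, which are usually the same; and it has a log message, which is whatever you want to write.
The tree attached to a commit is the source tree snapshot. It is the whole thing, not a set of changes. (Underneath, Git does get clever with compression, but the tree's hash ID gets you the hash IDs of files, and those files are complete files, not some weird compressed thing.) Have Git extract that tree and you get all the files.
Because each commit stores a parent hash ID, we can start from the most recent commit and work backwards. That's Git for you: backwards. We start with the hash ID of the most recent commit, which we have Git save for us in a branch name. We say that this branch name points to the commit: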
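To see these pieces on a commit of your own, here is a minimal sketch. The throwaway repository, the identity settings, and the file name README are all made up for illustration; only the git cat-file calls matter.

```shell
set -e
# Throwaway repository; path, identity, and file name are assumptions of this sketch.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "hello" > README
git add README
git commit -q -m "first commit"

# The raw commit object: a tree line, author, committer, and the log message.
# (This is a root commit, so there is no parent line.)
git cat-file -p HEAD

# The tree that the commit names: one entry per file, each with its own hash ID.
git cat-file -p 'HEAD^{tree}'

# The type of each object.
commit_type=$(git cat-file -t HEAD)
tree_type=$(git cat-file -t 'HEAD^{tree}')
echo "$commit_type $tree_type"
```

The tree listing shows whole files by hash ID, which is the "snapshot, not changes" point made above.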
<--C <--master
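The name master points to commit C. (I'm using one-letter names instead of big ugly hash IDs, which limits me to 26 commits but is a lot more convenient.) Commit C has another hash ID in it, though, so C points to another commit. That's C's parent, B. B of course points to yet another commit, but let's say our repository has just three commits total, so that B points back to A, but A is the first commit.
Since A is the first, it cannot have a parent. So it doesn't: it points no further back. We call A a root commit. Every repository has at least one (and usually only one) root commit.[1] This is where the action has to stop: we (or Git) can't go back any further.
In any case, once made, commits are permanent and unchanging.[2] This is because their hash IDs are made by computing a cryptographic hash of all the bits in the commit (everything you see with git cat-file -p). If you change anything, you get a new, different hash ID. Every hash ID is always unique.[3]
So, let's draw this, but not bother with the internal arrows; let's keep just the one for the branch name itself.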
A--B--C <-- master
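Each commit, then, saves a snapshot for you. When you put them together with their backwards arrows, you get the commit graph.
[1] Except, that is, for a completely empty repository, which obviously has no commits. That's how you get the root commit in the first place: by making a commit that has no parent.
[2] Once you have no use for them, though, they can be garbage-collected. Git normally does this invisibly; we'll see how this comes about shortly.
[3] Pay no attention to the web site behind the curtain! Seriously, though, the recent break of the SHA-1 hash is not an immediate problem for Git, but it does help push Git towards switching to SHA-256.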
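The backwards walk and the "change anything, get a new hash ID" rule can both be checked in a scratch repository. This is only a sketch: the repository, the identity, and file.txt are invented here, and git commit --amend is used merely as a convenient way to change one thing about commit C.

```shell
set -e
# Scratch repository; everything named here is an assumption of this sketch.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Example User"
git config user.email "user@example.com"

# Three commits, A <- B <- C.
for name in A B C; do
  echo "$name" > file.txt
  git add file.txt
  git commit -q -m "$name"
done

# Each line: a commit's hash ID, its parent's hash ID (empty for the root), its subject.
git log --format='%H %P %s' master

# The same walk by hand, starting from the branch name.
b=$(git rev-parse master~1)
a=$(git rev-parse master~2)
if git rev-parse --verify -q "$a^" >/dev/null; then
  echo "A has a parent"
else
  echo "A is a root commit"
fi
roots=$(git rev-list --count --max-parents=0 master)

# Changing anything (here, only the log message) yields a different hash ID.
before=$(git rev-parse master)
git commit -q --amend -m "C, reworded"
after=$(git rev-parse master)
echo "$before -> $after"
```

The original C is not edited in place: master now names a new commit with the same parent, and the old one is simply left behind.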
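Now that we see what the graph looks like with three commits, let's add a new commit to master and see how that works. First, we'll git checkout master as usual. This fills in the index and work-tree. Then we'll do our work, git add stuff, and git commit.
(Reminder: the work-tree is where you do your work. When Git saves files, it tucks them away under unpronounceable hash ID names and stores them compressed, which keeps them in a form that is useful only to Git itself. To work with the files, you need them in their normal form, and that's the work-tree. Meanwhile, the index is where you and Git build the next commit. You work on files in the work-tree, then run git add to copy them from the work-tree into the index. You can git add at any time: this just updates the index from the work-tree again. The index starts out matching the current commit, and then you modify it until you are ready to make a new commit.)
When you run git commit, Git collects your log message, and then:
1. Writes the index out as a tree: this is your saved snapshot, based on what you have replaced in the index from the work-tree. The new tree gets its own hash ID.
2. Writes a new commit object using this new tree ID, the ID of the current commit as its parent, you as the author and committer (with "now" as the time stamps), and your log message.
Step 2 gives Git the new hash ID for the new commit; let's call it D. Since the new commit has C's hash ID in it, D points back to C: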
A--B--C <-- master (HEAD)
       \
        D
The last thing Git does, though, is to write D's ID into the current branch name. If the current branch is master, this makes master point to D:
A--B--C
       \
        D <-- master (HEAD)
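The steps git commit takes can be imitated with Git's lower-level commands. The following is a rough sketch only: git commit itself does more (hooks, reflog messages, and so on), and the repository, identity, and file.txt are made up for the example.

```shell
set -e
# Scratch repository; everything named here is an assumption of this sketch.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo one > file.txt
git add file.txt
git commit -q -m "C"
c=$(git rev-parse HEAD)

# Work in the work-tree, then copy the result into the index.
echo two > file.txt
git add file.txt

# Write the index out as a tree object.
tree=$(git write-tree)

# Write a commit object naming that tree, with C as its parent.
d=$(echo "D" | git commit-tree "$tree" -p "$c")

# At this point master still names C; D exists but no name points to it.
git rev-parse master

# The last step: store D's hash ID in the current branch name.
git update-ref refs/heads/master "$d"
git log --format='%h %s' master
```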
If we git checkout -b some new branch first, though—just before we make the new commit, that is—then look at our new starting setup:
A--B--C <-- branch (HEAD), master
Both names, branch and master, point to C, but HEAD says that we are on branch branch, not on master. So when we make D and Git updates the current branch, we get this:
A--B--C <-- master
       \
        D <-- branch (HEAD)
This is how branches grow. A branch name just points to the tip commit of a branch; it's the commits themselves that form the graph.
It's worth stopping at this point and thinking about commits A-B-C. They're on master, for sure. But they are also on branch branch. A commit, in Git, may be on many branches at the same time. What we need to do, quite often, is limit how far back we let Git go when we tell it: "Get me all the commits starting from this branch-tip and working backwards."
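A small sketch of both ideas, commits being on several branches at once and limiting how far back Git walks. The helper mk and the one-file-per-commit layout are inventions of this example, not anything Git requires.

```shell
set -e
# Scratch repository; the helper and all names are assumptions of this sketch.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

# Hypothetical helper: one commit per call, each adding its own small file.
mk() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

mk A; mk B; mk C
git checkout -q -b branch
mk D

# Commit C (the tip of master) is on both branches.
git branch --contains master

# Commit D is only on branch.
git branch --contains branch

# Limit the backwards walk: commits reachable from branch but not from master.
only_on_branch=$(git log --format=%s master..branch)
echo "$only_on_branch"
```

The two-dot form master..branch is the usual way to say "start at this tip, but stop when you reach anything the other name can already reach".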
Now for the exciting part. Well, maybe exciting. :-) You have made several branches with a bunch of new commits, so let's draw that:
...--E--F--G <-- master
            \
             I--J--K--L <-- feature1
                    \
                     P--Q <-- feature2
Here, master ends at G, i.e., commit G is the tip of master. feature1 ends at L, and feature2 ends at Q. Commits E-F-G are on all three branches. Commits P-Q are only on feature2. Commits I-J-K are on both feature1 and feature2. Commit L is only on feature1.
Remember, again, these letters stand in for big ugly hash IDs, where the actual hash ID encodes everything in the commit: the saved tree and the parent IDs. So L requires K's hash ID, for instance. This kind of thing matters because we intend to copy some commits.
What you described wanting to do is to somehow transplant commits P and Q so that they sit atop master. What if there were a way to copy commits? It turns out that there is: it's called git cherry-pick.
Remember that we noted earlier that a commit is a snapshot. It's not a set of changes. But right now we wish that a commit were a set of changes, because commit P is a lot like its parent commit K, but with some changes made. After all, you made P by having K checked out, then editing files and git adding the new versions into the index and then git committing.
Fortunately, there's an easy[4] way to turn a snapshot into a changeset, by comparing it (git diff) against its parent commit. The output from git diff is a minimal[5] set of instructions: "Remove this line from this file, add these other lines to that file, etc." Applying those instructions to the tree in K will turn it into the tree in P.
But what happens if we apply those instructions to some other tree? As it turns out, this often "just works". We can git checkout commit G—the tip commit of branch master, but let's use a different branch name:
...--E--F--G <-- master, temp (HEAD)
            \
             I--J--K--L <-- feature1
                    \
                     P--Q <-- feature2
and then apply the diff to the work-tree. We'll assume that goes well, automatically git add the result to the index, and git commit while copying the log message from commit P. We'll call the new commit P' to say "like P, but with a different hash ID" (because it has a different tree, and a different parent):
             P' <-- temp (HEAD)
            /
...--E--F--G <-- master
            \
             I--J--K--L <-- feature1
                    \
                     P--Q <-- feature2
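The copy of P just drawn can be made by hand, as a sketch of the idea rather than of what git cherry-pick literally runs (the real command uses Git's merge machinery instead of a plain patch). The helper mk and the one-file-per-commit layout are assumptions chosen so that the patch applies cleanly.

```shell
set -e
# Scratch repository; the helper and all names are assumptions of this sketch.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
mk() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

# Build the graph: E-F-G on master, I-J-K-L on feature1, P-Q on feature2 (from K).
mk E; mk F; mk G
git checkout -q -b feature1; mk I; mk J; mk K
git checkout -q -b feature2; mk P; mk Q
git checkout -q feature1; mk L

p=$(git rev-parse feature2~1)        # commit P

# New branch name pointing at G.
git checkout -q -b temp master

# Turn P into a changeset (diff against its parent K) and apply it on top of G,
# updating both the work-tree and the index.
git diff "$p^" "$p" | git apply --index

# Commit, reusing P's log message and author information.
git commit -q -C "$p"

git log --format='%h %s' temp
```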
Now let's repeat this with Q. We run git diff P Q to turn Q into changes, apply those changes to P', and commit the result as new Q':
             P'-Q' <-- temp (HEAD)
            /
...--E--F--G <-- master
            \
             I--J--K--L <-- feature1
                    \
                     P--Q <-- feature2
This is just the two git cherry-pick steps, plus of course creating the temporary branch. But look what happens now if we erase the old name feature2 and change temp to feature2:
             P'-Q' <-- feature2 (HEAD)
            /
...--E--F--G <-- master
            \
             I--J--K--L <-- feature1
                    \
                     P--Q [abandoned]
It now looks like we made feature2 by doing git checkout -b feature2 master, then writing P' and Q' from scratch! That's just what you wanted.
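Spelled out as commands, the manual route might look like the sketch below. The graph is rebuilt in a scratch repository with a made-up helper mk, and each commit touches only its own file, so neither cherry-pick can conflict; real histories may not be so kind.

```shell
set -e
# Scratch repository; the helper and all names are assumptions of this sketch.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
mk() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

mk E; mk F; mk G
git checkout -q -b feature1; mk I; mk J; mk K
git checkout -q -b feature2; mk P; mk Q
git checkout -q feature1; mk L

old_p=$(git rev-parse feature2~1)
old_q=$(git rev-parse feature2)

# Temporary branch at G, then copy P and Q, one cherry-pick each.
git checkout -q -b temp master
git cherry-pick "$old_p"
git cherry-pick "$old_q"

# Erase the old name feature2 and rename temp to feature2.
git branch -D feature2
git branch -m temp feature2

git log --format='%h %s' master..feature2
```

The original P and Q are still in the repository, reachable by hash ID, until Git eventually garbage-collects them.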
[4] Easy, that is, after any number of master's and/or PhD theses on string-to-string edit problems.
[5] "Minimal" in some sense, and somewhat tweak-able through different diff algorithms. Minimizing the edit distance is important for compression but not actually for correctness. However, when we go to apply the edit instructions to some other tree, the minimal-ness, and the exact instructions, really start to matter.
rebase is automated cherry-pick plus the branch label moving
We can do all of the above at once using:
git checkout feature2
git rebase --onto master feature1
What we are doing here is using feature1 as a way to tell Git what to stop copying. Look back at the original graph, before the abandonment of the original commits. If we tell Git to start at feature1 and work backwards, that identifies commits L, K, J, I, G, F, and so on. Those are the commits we explicitly say not to copy: commits that are on branch feature1.
Meanwhile, the commits we do want to copy are those on feature2: Q, P, K, J, and so on. But we stop as soon as we hit any of the forbidden ones, so we'll copy only the P-Q commits.
The place we tell git rebase to copy to is—or is "just after"—the tip of master, i.e., copy commits so that they come after G.
Git rebase does it all for us, which is ridiculously easy. But there could be a hitch—or maybe several.
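Here is the same transplant done with the single rebase, as a sketch in a scratch repository. The helper mk and the one-file-per-commit layout are assumptions of the example; they also guarantee there are no conflicts.

```shell
set -e
# Scratch repository; the helper and all names are assumptions of this sketch.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
mk() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

mk E; mk F; mk G
git checkout -q -b feature1; mk I; mk J; mk K
git checkout -q -b feature2; mk P; mk Q
git checkout -q feature1; mk L
feature1_before=$(git rev-parse feature1)

# The commits that will be copied: on feature2, but not on feature1.
git log --format='%h %s' feature1..feature2

# Copy them so that they come after the tip of master, then move the label.
git checkout -q feature2
git rebase -q --onto master feature1

git log --format='%h %s' master..feature2
```

After the rebase, feature2 holds only the copies of P and Q on top of G, and feature1 is untouched.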
Let's say that we start out with this as before:
...--E--F--G <-- master (HEAD)
            \
             I--J--K--L <-- feature1
                    \
                     P--Q <-- feature2
We'd like to rebase feature2 onto master, skipping most of feature1, but it turns out we need what we changed in commit J too.
We don't need I, or K, or L, just J (plus of course P and Q).
We can't do this with just git rebase. We may need an explicit git cherry-pick, to copy J. But this is Git, so there are lots of ways to do this.
First, let's look at the explicit-cherry-pick method. We'll go ahead and make a new branch and cherry-pick J:
git checkout -b temp
git cherry-pick <hash-ID-of-J>
Now we have:
             J' <-- temp (HEAD)
            /
...--E--F--G <-- master
            \
             I--J--K--L <-- feature1
                    \
                     P--Q <-- feature2
Now we can transplant P-Q as before. We just change the --onto directive:
git checkout feature2
git rebase --onto temp feature1
The result is:
                 P'-Q' <-- feature2 (HEAD)
                /
             J' <-- temp
            /
...--E--F--G <-- master
            \
             I--J--K--L <-- feature1
                    \
                     P--Q [abandoned]
We don't need temp any more at all, so we can just git branch -d temp and straighten out our drawing:
             J'-P'-Q' <-- feature2 (HEAD)
            /
...--E--F--G <-- master
            \
             I--J--K--L <-- feature1
                    \
                     P--Q [abandoned]
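The whole explicit-cherry-pick sequence, end to end, might look like this sketch. As before, the scratch repository, the helper mk, and the one-file-per-commit layout are assumptions; in particular J is made independent of I here, which is what lets it be copied without conflicts.

```shell
set -e
# Scratch repository; the helper and all names are assumptions of this sketch.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
mk() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

mk E; mk F; mk G
git checkout -q -b feature1; mk I; mk J; mk K
git checkout -q -b feature2; mk P; mk Q
git checkout -q feature1; mk L

j=$(git rev-parse feature1~2)        # commit J (L's grandparent)

# Copy J onto a temporary branch that starts at master.
git checkout -q -b temp master
git cherry-pick "$j"

# Transplant P-Q so that they come after J'.
git checkout -q feature2
git rebase -q --onto temp feature1

# temp is no longer needed; feature2 now contains J'.
git branch -d temp

git log --format='%h %s' master..feature2
```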
Suppose that, instead of copying just P-Q, we let git rebase copy I-J-K-P-Q. This might actually be easier:
git checkout feature2
git rebase master
This time we don't need --onto: master tells Git both which commits to leave out and where to put the copies. We leave out commit G and earlier, and we copy after G. The result is:
             I'-J'-K'-P'-Q' <-- feature2
            /
...--E--F--G <-- master
            \
             I--J--K--L <-- feature1
                    \
                     P--Q [abandoned]
Now we have too many commits copied, but now we run:
git rebase -i master
which gives us a bunch of "pick" lines for each commit I', J', K', P', and Q'. We delete the ones for I' and K'. Git now copies again, giving:
             J''-P''-Q'' <-- feature2
            /
...--E--F--G <-- master
            \
             I--J--K--L <-- feature1
                    \
                     P--Q [abandoned]
which is what we want (the original copies are still in there, abandoned like the original-original P-Q, but they were there for so little time, who cares anyway? :-) ). And of course, we can make that first git rebase use -i and remove the pick lines, and just have the J'-P'-Q' copies, all in one step.
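For experimenting with the one-step variant without an editor popping up, the instruction sheet can be edited by a command instead of by hand. This is a sketch under several assumptions: the GIT_SEQUENCE_EDITOR environment variable is used to run sed on the sheet, the commit subjects are the single letters used in the drawings (so a line ending in " I" or " K" is the one to delete), and the helper mk is made up. Normally you would just delete the two pick lines in your editor.

```shell
set -e
# Scratch repository; the helper and all names are assumptions of this sketch.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
mk() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

mk E; mk F; mk G
git checkout -q -b feature1; mk I; mk J; mk K
git checkout -q -b feature2; mk P; mk Q
git checkout -q feature1; mk L

git checkout -q feature2

# Interactive rebase of I-J-K-P-Q onto master, with the pick lines for
# I and K deleted by sed instead of by hand.
GIT_SEQUENCE_EDITOR="sed -i.bak -e '/ I\$/d' -e '/ K\$/d'" git rebase -i master

git log --format='%h %s' master..feature2
```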
That's fine as far as it goes, but now there is both a J and a J'. Actually there's nothing wrong with this—you can leave this situation in place, and even merge with it like this, with no real harm done. But you might want to make J' part of master first and then share it.
Again, there is more than one way to do this. I want to illustrate one particular way, though, because git rebase has some magic in it.
Let's say that we have done the feature2 rebasing so that we have this now. We'll drop the abandoned commits entirely, just like Git does when it eventually gets around to garbage-collecting them (note: you get at least 30 days by default before this happens, giving you about a month to change your mind):
             J'-P'-Q' <-- feature2
            /
...--E--F--G <-- master
            \
             I--J--K--L <-- feature1
You can now fast-forward master to include J':
git checkout master
git merge --ff-only <hash-id-of-J'>
This moves the labels, without changing the commit graph. To make it easy to draw in ASCII text, though, I'll move J' down one row:
                 P'-Q' <-- feature2
                /
...--E--F--G--J' <-- master
            \
             I--J--K--L <-- feature1
(We could also get here by explicitly git cherry-picking J into master originally, then rebasing feature2 without any fancy footwork.) So now we'd like to copy feature1's commits, adding them after J', and removing J.
We could do this with another git rebase -i, which lets us explicitly delete the original commit J. But we don't have to. Well, we don't have to, most of the time. Instead, we just run:
git checkout feature1
git rebase master
This tells Git that it should consider I-J-K-L as the candidates for the copy, and put the copies after J' (where master now points). But (here's the magic) git rebase looks closely at all[6] of the commits that are on master that are not on feature1 (these are called the upstream commits, in at least a few bits of documentation). In this case, that's J' itself. For each such commit, Git diffs the commit against its parent (a la git cherry-pick) and turns the result into a patch ID. It does the same with each candidate commit. If one of the candidates (J) has the same patch ID as one of the upstream commits, Git eliminates the candidate from the list!
Hence, as long as both J and J' have the same patch ID, Git automatically drops J, so that the final result is:
                 P'-Q' <-- feature2
                /
...--E--F--G--J' <-- master
            \
             I'-K'-L' <-- feature1
which is just what we wanted.
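The patch-ID behavior can be observed directly. This sketch is simplified: it leaves feature2 out, gets J' onto master with a plain cherry-pick (the alternative route mentioned above), and uses a made-up helper mk in a scratch repository. git patch-id prints the patch ID followed by the commit ID, so only the first field is compared.

```shell
set -e
# Scratch repository; the helper and all names are assumptions of this sketch.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
mk() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

mk E; mk F; mk G
git checkout -q -b feature1; mk I; mk J; mk K; mk L

j=$(git rev-parse feature1~2)        # commit J

# Put a copy of J (J') on master.
git checkout -q master
git cherry-pick "$j"
jprime=$(git rev-parse master)

# J and J' have different hash IDs but the same patch ID.
id_j=$(git show "$j" | git patch-id | cut -d' ' -f1)
id_jprime=$(git show "$jprime" | git patch-id | cut -d' ' -f1)
echo "$id_j"
echo "$id_jprime"

# Rebase feature1: I, K, and L are copied; J is dropped as already applied.
git checkout -q feature1
git rebase master

git log --format='%h %s' master..feature1
```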
[6] All, that is, except merges. Rebase literally can't copy merges (a new merge has a different parent-set than the original, and cherry-picking "undoes" a merge in the first place), so by default it skips them entirely. Merges don't get a patch ID assigned, and don't get plucked out of the set because they were never in the set. It's usually a bad idea to rebase a graph-fragment that contains a merge.
Git does have a mode that tries to do it. This mode re-performs the merges (as it has to: I leave working out the details as an exercise). But there are a bunch of dangers here, so it's usually best not to do this at all. I have said before that probably git rebase should default to "preserving" merges, but by erroring-out if there are merges, requiring either a "yes, go ahead and try to re-create merges" flag, or a "flatten away and remove merges" flag to proceed.
It doesn't, though, so it's up to you to draw the graph and make sure your rebases make sense.
Any time you git rebase some commits, you run the risk of merge conflicts. This is particularly true if you are plucking a segment of commits out of a long chain:
        o--...--B--1--2--3--4--...--o <-- topic
       /
...o--*--o--o--o--T <-- develop
If we want to "move" (copy, then remove) commits 1-4 into develop, there's a good chance some or all parts of some of those four commits depend, in some way, on the other top-row commits that come before them (B and earlier). When that happens, we tend to get merge conflicts, sometimes many. Git winds up viewing the copy of commit 1 as a three-way merge operation, merging the changes from B to 1 with the changes from B to T. The changes "from" B to T may look quite complex, and may not appear sensible out of context, because we have to "go backwards" through the commits before B down to * and then "forwards" up to T.
It is up to you to figure out how, or even whether it is wise, to do this.
Because rebase is fundamentally a copy operation, you must consider who might still have the original commits. Since commits can be on many branches, it may be the case that you have the originals. This is what we saw when we had both J and J', for instance.
Sometimes—even somewhat often—this may not be a big deal. Sometimes it is. If all the extra copies are only in your own repository, you can resolve all this on your own. But what happens if you have published (pushed, or let others fetch from you) some of your commits? In particular, what if some other repository has those original commits, with their original hash IDs? If you have published the original commits, you must tell everyone else who has them: "Hey, I'm abandoning the originals, I have shiny new copies elsewhere." You must get them to do the same thing, or else put up with the extra commit copies.
Extra commits are sometimes harmless. This is particularly true of merges, since git merge works hard to take only one copy of any given change (although Git cannot always get this quite right on its own, since each change—each git diff output—depends on context and other changes, and the minimal-edit-distance algorithms sometimes go wrong themselves, picking the wrong "minimum changes"). Even if they do not break the tree, though, they do clutter up the commit history. Whether and when this might be a problem is hard to predict.
For your goals, git rebase is a powerful tool. It needs a bit of care when using it, and the most important thing to remember is that it copies commits, then abandons—or tries to abandon—the originals.
If your history looks like this:
...--E--F--G <-- master
            \
             I--J--K--L <-- feature1
                    \
                     P--Q <-- feature2
then, in the simple case, to remove feature1's commits from feature2, do the following:
git checkout feature2
git rebase --onto master feature1
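This is an abbreviated version of @torek's answer, which is phenomenal, but in which it is hard to find the actual answer to the question. Please read @torek's answer for more detail and for what to do in the non-simple cases.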