How do I change the starting point of a branch?

Dan*_*obe 14 git github git-branch

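Usually I create a branch by running a command like git checkout -b [branch-name] [starting-branch]. In one case, I forgot to include the starting-branch, and now I want to correct it. How do I do that after the branch has already been created?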

tor*_*rek 36

The short answer is that once you have some commits, you want to git rebase them, using the long form of git rebase: git rebase --onto newbase upstream. To find out how to identify each of these, see the (very) long answer below. (Unfortunately, it got a bit out of hand and I do not have time to shorten it.)

The problem you have here is that in Git, branches don't have a "starting point"—at least, not in any useful way.

The term "branch", in Git, is ambiguous

The first issue here is that, in Git, the word "branch" has at least two distinct meanings. Usually, when we talk loosely about "the branch", it's clear from context whether we mean the branch name (the thing that's a word like master or develop or feature-X) or the thing that I call "branch ancestry" or "branch structure", or more informally, a "DAGlet".[1] See also What exactly do we mean by "branch"?

In this particular case, unfortunately, you mean both of these, at the same time.


[1] The term DAG is short for Directed Acyclic Graph, which is what the commit graph is: a set of vertices or nodes, and directional (from child to parent) edges, such that there are no cycles through the directed edges from any node back to itself. To this I simply add the "-let" diminutive suffix. The resulting word has a happy resemblance to the word aglet, plus a certain assonance with the word "dagger", making it sound slightly dangerous: "Is this a DAGlet which I see before me?"

Draw your commit graph

Whenever you need to grapple with these issues, it helps to draw a graph of what you have now, or at least some useful subset of what you have now. There are of course many ways to draw this (see that linked question for several options, including some bad ones :-) ), but in plain text in a StackOverflow answer, I generally draw them like this:

...--o--o--o           <-- master
         \
          o--o--o--o   <-- develop

The round o nodes represent commits, and the branch names master and develop point to one specific tip commit on each branch.

In Git, every commit points back to its parent commit(s), and this is how Git forms branch structures. By "branch structures", I mean here particular subsets of the overall ancestry part of the graph, or what I call the DAGlets. The name master points to the tip-most commit of the master branch, and that commit points back (leftward) to another commit that is the previous commit on the branch, and that commit points leftward again, and so on.

When we need to talk about specific commits within this graph, we can use their actual names, which are the big ugly 40-character hashes that identify each Git object. Those are really clumsy though, so what I do here is replace the little round os with uppercase letters:

...--A--B--C           <-- master
         \
          D--E--F--G   <-- develop

and now it's easy to say, e.g., that the name master points to commit C, and C points to B, and B points to A, which points back to more history that we don't really care about and hence just left as ....

Where does a branch begin?

Now, it's perfectly obvious, to you and me, based on this graph drawing, that branch develop, whose tip commit is G, starts at commit D. But it's not obvious to Git—and if we draw the same graph a little differently, it may be less obvious to you and me too. For instance, look at this drawing:

          o             <-- X
         /
...--o--o--o--o--o--o   <-- Y

Obviously branch X has just the one commit and the main line is Y, right? But let's put some letters in:

          C             <-- X
         /
...--A--B--D--E--F--G   <-- Y

and then move Y down a line:

          C            <-- X
         /
...--A--B
         \
          D--E--F--G   <-- Y

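Take another look: what happens if we move C down to the main line, and realize that X is master and Y is develop? Which branch is commit B on, after all?

In Git, commits may be on many branches at the same time; DAGlets are up to you

Git's answer to this dilemma is that commits A and B are on both branches. The beginning of branch X is way off to the left, in the ... part. But so is the beginning of branch Y. As far as Git is concerned, a branch "starts" at whatever root commit(s) it can find in the graph.

This is important to keep in mind in general. Git has no real concept of where a branch "began", so we wind up having to give it extra information. Sometimes that information is implied, and sometimes it is explicit. It's also important, in general, to remember that commits are often on many branches, so instead of specifying branches, we usually specify commits.

We just often use branch names to do this. But if we give Git just a branch name, and tell it to find all the ancestors of the tip commit of that branch, Git goes all the way back in history.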

In your case, if you write the name develop and ask Git to select that commit and its ancestors, you get commits D-E-F-G (which you wanted) and commit B, and commit A, and so on (which you didn't). The trick, then, is to somehow identify which commits you don't want, along with which commits you do.

Normally we use the two-dot X..Y syntax

With most Git commands, when we want to select some particular DAGlet, we use the two-dot syntax described in gitrevisions, such as master..develop. Most[2] Git commands that work on multiple commits treat this as: "Select all commits starting from the tip of the develop branch, but then subtract from that set the set of all commits starting from the tip of the master branch." Look back at our graph drawing of master and develop: this says "do take commits starting from G and working backwards" (which gets us too many, since it includes commits B and A and earlier) "but exclude commits starting from C and working backwards." It's that exclude part that gets us what we want.

Hence, writing master..develop is how we name commits D-E-F-G, and have Git compute that automatically for us, without having to first sit down and draw out a big chunk of the graph.
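To see the selection in action, here is a minimal sketch that rebuilds the A-B-C / D-E-F-G drawing in a throwaway repository and asks Git to list master..develop. The commit helper, the file names, and the demo identity are made up for illustration; only the graph shape comes from the drawing above.

```shell
set -e
# Scratch repository; every name below is illustrative only.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is named master
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
# Hypothetical helper: one commit per call, each adding its own file.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit A; commit B                         # shared history
git checkout -q -b develop                 # develop forks off at B
commit D; commit E; commit F; commit G
git checkout -q master
commit C                                   # master moves on to C

# Ancestors of develop, minus ancestors of master, oldest first.
selected=$(echo $(git log --format=%s --reverse master..develop))
echo "$selected"                           # prints: D E F G
```

Commits A and B are reachable from both names, so the subtraction removes them without our ever having to say where develop "began".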


[2] Two notable exceptions are git rebase, which is in its own section just below, and git diff. The git diff command treats X..Y as simply meaning X Y, i.e., it effectively just ignores the two dots entirely. Note that this has a very different effect than set subtraction: in our case, git diff master..develop simply diffs the tree for commit C against the tree for commit G, even though master..develop never has commit C in the first set.

In other words, mathematically speaking, master..develop is normally ancestors(develop) - ancestors(master), where the ancestors function includes the specified commit, i.e., is testing ≤ rather than just <. Note that ancestors(develop) does not include commit C at all. The set subtraction operation simply ignores the presence of C in the set ancestors(master). But when you feed this to git diff, it does not ignore C: it does not diff, say, B against G. While that might be a reasonable thing to do, git diff instead steals the three-dot master...develop syntax to accomplish this.
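The git diff exception can be checked the same way. This sketch (same made-up helper and file names as any scratch repository would need, one file per commit) compares the file lists reported for the two-dot, plain two-argument, and three-dot forms. I pass --no-renames only so that the list cannot be affected by rename detection.

```shell
set -e
# Scratch repository; every name below is illustrative only.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
# Hypothetical helper: one commit per call, each adding its own file.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit A; commit B
git checkout -q -b develop
commit D; commit E; commit F; commit G
git checkout -q master
commit C

# Two dots and a plain pair of names both compare tree C with tree G.
two_dot=$(echo $(git diff --no-renames --name-only master..develop))
plain=$(echo $(git diff --no-renames --name-only master develop))
# Three dots compare the merge base (B) with tree G.
three_dot=$(echo $(git diff --no-renames --name-only master...develop))

echo "two-dot:   $two_dot"     # C.txt D.txt E.txt F.txt G.txt
echo "plain:     $plain"       # same list
echo "three-dot: $three_dot"   # D.txt E.txt F.txt G.txt
```

C.txt shows up in the first two lists because commit C's tree really is one side of the comparison, even though C is not in the master..develop commit set.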

Git's rebase is a little bit special

The rebase command is almost always used to move[3] one of these DAGlet commit-subsets from one point in the graph to another. In fact, that's what rebase is, or was originally anyway, defined to do. (Now it has a fancy interactive rebase mode, which does this and a bunch more history editing operations. Mercurial has a similar command, hg histedit, with a slightly better name, and much tighter default semantics.[4])

Since we always (or almost always) want to move a DAGlet, git rebase builds in this subset selection for us. And, since we always (or almost always) want to move the DAGlet to come just after the tip of some other branch, git rebase defaults to choosing the target (or --onto) commit using a branch name, and then uses that same branch name in the X..Y syntax.[5]


[3] Technically, git rebase actually copies commits, rather than moving them. It has to, because commits are immutable, like all Git's internal objects. The true name, the SHA-1 hash, of a commit is a checksum of the bits making up the commit, so any time you change anything (including something as simple as the parent ID) you have to make a new, slightly-different, commit.

[4] In Mercurial, quite unlike Git, branches really do have starting points, and (more important for histedit) commits record their phase: secret, draft, or published. History editing readily applies to secret or draft-phase commits, and not so much to published commits. This is true of Git as well, but since Git has no concept of commit phases, Git's rebase must use these other techniques.

[5] Technically the <upstream> and --onto arguments can just be raw commit IDs. Note that 1234567..develop works just fine as a range selector, and you can rebase --onto 1234567 to place the new commits after commit 1234567. The only place that git rebase truly needs a branch name is for the name of the current branch, which it normally just reads from HEAD anyway. However, we usually want to use a name, so that's how I describe it all here.


That is, if we're currently on branch develop, and in this situation that we drew before:

...--A--B--C           <-- master
         \
          D--E--F--G   <-- develop

we probably just want to move the D-E-F-G chain onto the tip of master, to get this:

...--A--B--C              <-- master
            \
             D'-E'-F'-G'  <-- develop

(The reason I changed the names from D-E-F-G to D'-E'-F'-G' is that rebase is forced to copy the original commits, rather than actually moving them. The new copies are just as good as the originals, and we can use the same single letter name, but we should at least note, however vaguely, that these are in fact copies. That's what the "prime" marks, the ' characters, are for.)

Because this is what we usually want, git rebase will do this automatically if we just name the other branch. That is, we're on develop now:

$ git checkout develop

and we want to rebase commits that are on branch develop and are not on master, moving them to the tip of master. We might express this as git somecmd master..develop master, but then we would have to type the word master twice (such a dreadful fate). So instead, Git's rebase infers this when we just type in:

$ git rebase master

The name master becomes the left side of the two-dot .. DAGlet selector, and the name master also becomes the target of the rebase; and Git then rebases D-E-F-G onto C. Git gets our branch's name, develop, by reading out the current branch name. In fact, it uses a shortcut, which is that when you need the current branch name, you can normally just write HEAD instead. So master..develop and master..HEAD mean the same thing, because HEAD is develop.

Git's rebase calls this name the <upstream>. That is, when we say git rebase master, Git claims, in the documentation, that master is the <upstream> argument to git rebase. The rebase command then operates on commits in <upstream>..HEAD, copying them after whatever commit is in <upstream>.

This is going to become a problem for us soon, but for now, just make note of it.

(Rebase also has the sneaky, but desirable, side feature of omitting any of the D-E-F-G commits that sufficiently resembles commit C. For our purposes we can ignore this.)
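As a rough check of the simple case, the sketch below rebuilds the master/develop drawing in a scratch repository and runs git rebase master while on develop. The helper and file names are invented for the demo. Each commit touches its own file, so no conflicts are expected here; a real rebase may well stop on conflicts.

```shell
set -e
# Scratch repository; every name below is illustrative only.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
# Hypothetical helper: one commit per call, each adding its own file.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit A; commit B
git checkout -q -b develop
commit D; commit E; commit F; commit G
git checkout -q master
commit C

git checkout -q develop
old_tip=$(git rev-parse develop)     # the original G
git rebase -q master                 # selects master..HEAD, copies after master's tip
new_tip=$(git rev-parse develop)     # G', a different commit
base=$(git merge-base master develop)
copied=$(echo $(git log --format=%s --reverse master..develop))

echo "tip changed: $old_tip -> $new_tip"
echo "now based on master's tip: $base"
echo "commits after master: $copied"   # D E F G (the copies keep their messages)
```

The tip hash changes because the copies are new commits, while the merge base of master and develop is now master's own tip commit C.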

What's wrong with the other answer to this question

In case the other answer gets deleted, or becomes one of several other answers, I'll summarize it here as "use git branch -f to move the branch label." The flaw in the other answer—and, perhaps more importantly, precisely when it's a problem—becomes obvious once we draw our graph DAGlets.

Branch names are unique, but tip commits are not necessarily so

Let's take a look at what happens when you run git checkout -b newbranch starting-point. This asks Git to root around in the current graph for the given starting-point, and make the new branch label point to that specific commit. (I know I said above that branches don't have a starting point. This is still mostly true: we're giving the git checkout command a starting point now, but Git is about to set it and then, crucially, forget it.) Let's say that starting-point is another branch name, and let's draw a whole bunch of branches:

          o--o--o--o     <-- brA
         /
...--o--o--o--o--o--o    <-- brB
            \
             o--o--o     <-- brC
                 \
                  o--o   <-- brD

Since we have four branch names, we have four branch tips: four branch-tip commits, identified by the names brA through brD. We pick one and make a new branch name newbranch that points to the same commit as one of these four. I have arbitrarily picked brA here:

          o--o--o--o     <-- brA, newbranch
         /
...--o--o--o--o--o--o    <-- brB
            \
             o--o--o     <-- brC
                 \
                  o--o   <-- brD

We now have five names, and five ... er, four? ... well, some tip commits. The tricky bit is that brA and newbranch both point to the same tip commit.

Git knows—because git checkout sets it—that we're now on newbranch. Specifically Git writes the name newbranch into HEAD. We can make our drawing a bit more accurate by adding this information:

          o--o--o--o     <-- brA, HEAD -> newbranch
         /
...--o--o--o--o--o--o    <-- brB
            \
             o--o--o     <-- brC
                 \
                  o--o   <-- brD

At this point, the four commits that used to be only on branch brA are now on both brA and newbranch. And, by the same token, Git no longer knows that newbranch starts at the tip of brA. As far as Git is concerned, both brA and newbranch contain those four commits and all the earlier ones too, and both of them "start" way back in time somewhere.

When we make new commits, the current name moves

Since we're on branch newbranch, if we make a new commit now, the new commit's parent will be the old tip commit, and Git will adjust the branch name newbranch to point to the new commit:

                     o   <-- HEAD -> newbranch
                    /
          o--o--o--o     <-- brA
         /
...--o--o--o--o--o--o    <-- brB
            \
             o--o--o     <-- brC
                 \
                  o--o   <-- brD

Note that none of the other labels moved: the four "old" branches stay put, only the current (HEAD) branch changes. It changes to accommodate the new commit we just made.

Note that Git continues to have no idea that branch newbranch "started" at brA. It's just the case, now, that newbranch contains one commit that brA does not, plus the four commits that they both contain, plus all those earlier commits.
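A small sketch of this, using a scratch repository with an invented helper and file names: right after git checkout -b the two names resolve to the same commit, and after one new commit only the current name has moved. Nothing in the repository records that newbranch was created from brA; the relationship can only be recomputed from the graph.

```shell
set -e
# Scratch repository; every name below is illustrative only.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/brA
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
# Hypothetical helper: one commit per call, each adding its own file.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit A1; commit A2
git checkout -q -b newbranch brA       # new label, same tip commit

tip_brA=$(git rev-parse brA)
tip_new=$(git rev-parse newbranch)
echo "$tip_brA"
echo "$tip_new"                        # identical to the line above

commit N1                              # only the current branch name moves
parent_of_new=$(git rev-parse 'newbranch^')
only_on_new=$(git rev-list --count brA..newbranch)
echo "brA still at:      $(git rev-parse brA)"
echo "newbranch parent:  $parent_of_new"
echo "commits not on brA: $only_on_new"   # 1
```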

What git branch -f does

Using git branch -f lets us move a branch label. Let's say, for whatever mysterious reason, we don't want branch label brB to point where it does in our current drawing. Instead, we want it to point to the same commit as brC. We can use git branch -f to change the place to which brB points, i.e., to move the label:

$ git branch -f brB brC

                     o   <-- HEAD -> newbranch
                    /
          o--o--o--o     <-- brA
         /
...--o--o--o--o--o--o    [abandoned]
            \
             o--o--o     <-- brC, brB
                 \
                  o--o   <-- brD

This makes Git "forget" or "abandon" those three commits that were only on brB before. That's probably a bad idea—why did we decide to do this strange thing?—so we probably want to put brB back.

Reflogs

Fortunately, "abandoned" commits are normally remembered in what Git calls reflogs. Reflogs use an extended syntax, name@{selector}. The selector part is usually either a number or date, such as brB@{1} or brB@{yesterday}. Every time Git updates a branch name to point to some commit, it writes a reflog entry for that branch, with the pointed-to commit's ID, a time-stamp, and an optional message. Run git reflog brB to see these. The git branch -f command wrote the new target as brB@{0}, bumping up all the old numbers, so now brB@{1} names the previous tip commit. So:

$ git branch -f brB 'brB@{1}'
    # you may not need the quotes, 'brB@{...}' --
    # I need them in my shell, otherwise the shell
    # eats the braces.  Some shells do, some don't.

will put it back (and also renumber all the numbers again: each update replaces the old @{0} and makes it @{1}, and @{1} becomes @{2}, and so on).
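Here is a minimal sketch of that move-and-restore round trip in a scratch repository. The helper, file names, and the exact shape of the graph are made up for the demo, and it assumes reflogs are enabled, which is the default for a normal (non-bare) repository.

```shell
set -e
# Scratch repository; every name below is illustrative only.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/brB
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
# Hypothetical helper: one commit per call, each adding its own file.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit B1; commit B2; commit B3
git checkout -q -b brC 'HEAD~2'     # brC forks off at B1
commit C1                           # we are on brC, so brB may be force-moved

old_tip=$(git rev-parse brB)
git branch -f brB brC               # brB now names C1; B2 and B3 are "abandoned"
moved_tip=$(git rev-parse brB)
git branch -f brB 'brB@{1}'         # previous value, read from brB's reflog
restored_tip=$(git rev-parse brB)

git reflog brB                      # shows both forced updates at the top
```

After the second git branch -f, restored_tip equals old_tip again, and the reflog has gained two entries, one per update.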

Anyway, suppose that we do our git checkout -b newbranch while we're on brC, and fail to mention brA. That is, we start with:

          o--o--o--o     <-- brA
         /
...--o--o--o--o--o--o    <-- brB
            \
             o--o--o     <-- HEAD -> brC
                 \
                  o--o   <-- brD

and run git checkout -b newbranch. Then we get this:

          o--o--o--o     <-- brA
         /
...--o--o--o--o--o--o    <-- brB
            \
             o--o--o     <-- brC, HEAD -> newbranch
                 \
                  o--o   <-- brD

If we meant to make newbranch point to the tip commit of brA, we can in fact do that right now, with git branch -f. But let's say we make a new commit before realizing that we made newbranch start at the wrong point. Let's draw it in:

          o--o--o--o     <-- brA
         /
...--o--o--o--o--o--o    <-- brB
            \
             o--o--o     <-- brC
                 \  \
                 |   o   <-- HEAD -> newbranch
                 \
                  o--o   <-- brD

If we use git branch -f now, we'll abandon (that is, lose) the commit we just made. What we want instead is to rebase it onto the commit that branch brA points to.

A simple git rebase copies too much

What if, instead of using git branch -f, we use git rebase brA? Let's analyze this using (what else?) our DAGlets. We start with the drawing above, with the extended leg going out to brD, though in the end we get to ignore that leg, and with the section going to brB, most of which we also get to ignore. What we don't get to ignore is all that stuff in the middle, that we get by tracing the lines back.

The git rebase command, in this form, will use brA..newbranch to pick commits to copy. So, starting with the whole DAGlet, let's mark (with *) all the commits that are on (or contained in) newbranch:

          o--o--o--o     <-- brA
         /
...--*--*--*--o--o--o    <-- brB
            \
             *--*--*     <-- brC
                 \  \
                 |   *   <-- HEAD -> newbranch
                 \
                  o--o   <-- brD

Now, let's un-mark (with x) all the commits that are on (or contained in) brA:

          x--x--x--x     <-- brA
         /
...--x--x--*--o--o--o    <-- brB
            \
             *--*--*     <-- brC
                 \  \
                 |   *   <-- HEAD -> newbranch
                 \
                  o--o   <-- brD

Whatever remains—all the * commits—are the ones that git rebase will copy. That's way too many!
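The count can be confirmed in a scratch repository. This sketch rebuilds only the relevant part of the drawing (brD and the later brB commits are left out, since they do not affect the selection), with invented commit names and a made-up helper. It then applies the long form from the short answer, git rebase --onto newbase upstream. Using brA as newbase and brC as upstream is my reading of this example, and it assumes brC still points at the commit where newbranch was created.

```shell
set -e
# Scratch repository; every name below is illustrative only.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/brB
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
# Hypothetical helper: one commit per call, each adding its own file.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit M1; commit M2
git checkout -q -b brA; commit A1; commit A2; commit A3; commit A4
git checkout -q brB;    commit M3
git checkout -q -b brC; commit C1; commit C2; commit C3
git checkout -q -b newbranch          # oops: created from brC, not brA
commit N1

too_many=$(git rev-list --count brA..newbranch)   # what "git rebase brA" would copy
wanted=$(git rev-list --count brC..newbranch)     # just our own commit
echo "brA..newbranch selects $too_many commits"   # 5
echo "brC..newbranch selects $wanted commit"      # 1

# Copy only brC..HEAD, placing the copy after the tip of brA.
git rebase -q --onto brA brC
after=$(git rev-list --count brA..newbranch)
new_parent=$(git rev-parse 'newbranch^')
echo "after --onto: $after commit on top of brA"  # 1
```

The five commits in the first count are the starred ones in the drawing: one on the main line, the three on brC, and the new commit itself.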

We need to get __COD

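  • To the first comment: no, because `git rebase` insists on inserting the `..HEAD` part itself (so you cannot, and should not try to, supply it). To the second comment: yes, any time you try to copy commits (with `git rebase`, `git diff | git apply`, `git format-patch | git am`, `git cherry-pick`, or even `git revert`, since a revert just "applies backwards" after all), you may run into merge conflicts. (3 upvotes)
  • @Attilio: rebase works by commits, not by branch names. Choose the target commit with `--onto` as usual, and choose the upstream limiter with the other argument. At the end of the copying process, Git moves the *current* branch name to the last-copied commit (or to the `--onto` target, if no commits were copied). (3 upvotes)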