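Suppose you have two files in a git repository, A.txt and B.txt.
Is it possible to combine the two files into a third file, A+B.txt, delete the original A.txt and B.txt, and commit it all so that the history is still preserved?
That is, if I run git log --follow A+B.txt, would I be able to see that the content came from A.txt and B.txt?
I tried splitting the files onto two different branches and then merging them into one new file (deleting the old files along the way), but to no avail.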
The answer is yes!
Full credit goes to Raymond Chen's article Combining two files into one while preserving line history:
Suppose you have two files: fruits & veggies
The naïve way of combining the files would be to do it in a single commit, but you'll lose line history on one of the files (or both).
You could tweak the git blame algorithms with options like -M and -C to get it to try harder, but in practice, you don't often have control over those options (e.g. the git blame may be performed on a server).
The trick is to use a merge with two forked branches:
- In one branch, we rename veggies to produce.
- In the other branch, we rename fruits to produce.

git checkout -b rename-veggies
git mv veggies produce
git commit -m "rename veggies to produce"

git checkout -
git mv fruits produce
git commit -m "rename fruits to produce"

Then merge the first into the second:

git merge -m "combine fruits and veggies" rename-veggies

This will generate a merge conflict. That's okay. Now take the changes from each branch's produce file and combine them into one. Here's a simple concatenation (but resolve the merge conflict however you please):

cat "produce~HEAD" "produce~rename-veggies" >produce
git add produce
git merge --continue

The resulting produce file was created by a merge, so git knows to look in both parents of the merge to learn what happened. And that's where it sees that each parent contributed half of the file, and it also sees that the files in each branch were themselves created via renames of other files, so it can chase the history back into both of the original files.
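To try the whole sequence end to end, here is a minimal sketch that replays it in a throwaway repository. The file contents, the demo identity, and the commit messages are made up for illustration. One deliberate difference from the quoted commands: as far as I can tell, some Git versions leave separate produce~... side files after the conflict while others write a single produce with conflict markers, so the sketch reads each side from its commit with git show instead of relying on those file names.

```shell
#!/bin/sh
# Sketch: replay the rename-then-merge trick in a scratch repository.
set -eu

cd "$(mktemp -d)"
git init -q .
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"

# Two files, each introduced by its own commit.
printf 'apple\nbanana\n' > fruits
git add fruits
git commit -q -m "add fruits"
printf 'carrot\nleek\n' > veggies
git add veggies
git commit -q -m "add veggies"

# A pure rename on each of two forked branches.
git checkout -q -b rename-veggies
git mv veggies produce
git commit -q -m "rename veggies to produce"
git checkout -q -
git mv fruits produce
git commit -q -m "rename fruits to produce"

# The merge is expected to stop with a conflict, hence "|| true".
git merge -m "combine fruits and veggies" rename-veggies >/dev/null 2>&1 || true

# Resolve by concatenating each side's committed version of the file.
{ git show HEAD:produce; git show rename-veggies:produce; } > produce
rm -f ./produce~*        # drop side files, if this Git version left any
git add produce
git commit -q --no-edit  # concludes the merge with the message given above

# Show which commit each line of produce is attributed to.
git blame --line-porcelain produce | grep '^summary '
```

If the trick worked, the summary lines should name the "add fruits" and "add veggies" commits rather than the merge commit.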
Each line should be correctly attributed to the person who introduced it in the original file, whether it’s fruits or veggies. People investigating the produce file get a more accurate history of who last touched each line of the file.
For best results, your rename commit should be a pure rename. Resist the temptation to edit the file’s contents at the same time you rename it. A pure rename ensures that git’s rename detection will find the match. If you edit the file in the same commit as the rename, then whether the rename is detected as such will depend on git’s “similar files” heuristic.
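One way to check that a rename commit really is a pure rename is to ask git diff for the rename status it detects. The sketch below uses a scratch repository with made-up contents and a placeholder identity; -M and --name-status are standard git diff options.

```shell
#!/bin/sh
# Sketch: confirm that a commit is detected as a 100% rename.
set -eu

cd "$(mktemp -d)"
git init -q .
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"

printf 'apple\nbanana\ncherry\n' > fruits
git add fruits
git commit -q -m "add fruits"

# Rename only; the contents are left untouched.
git mv fruits produce
git commit -q -m "rename fruits to produce"

# -M enables rename detection. The first column is R followed by the
# similarity score; an untouched file is reported as R100.
git diff -M --name-status HEAD~1 HEAD
```

A score below 100 means the commit also changed the contents, which is the case the paragraph above warns about.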
Check out the full article for a step-by-step breakdown and more explanation.
Originally, I had thought this might be a use case for git merge-file doing something like this:
>produce echo #empty
git merge-file fruits produce veggies --union -p > produce
git rm fruits veggies
git add produce
git commit -m "combine fruits and veggies"
However, all this does is simulate the merge diffing algorithm against two different files. Once committed, the result is identical to what you would get by editing the file by hand and committing the changes manually.
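The short answer is "no" (or perhaps even Mu). (However, for a way to get a useful synthetic line-by-line history for the combined file via git blame, see KyleMit's answer.)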
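In Git, history is the set of commits. There is no such thing as "file history": you either have a commit or you don't, and that commit has one or more parents, or none. This means that "file history" does not exist as a thing, and yet git log --follow exists. That looks like a contradiction: how can git log --follow produce a file history if file history doesn't exist?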
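The answer is that git log --follow cheats. It doesn't actually find a file history. It walks the history, that is, the set of commits, and builds a sub-history by changing the (single) name of the file it is looking for. It looks at one commit at a time and runs a (sped-up, limited) git diff --find-renames of that commit against its parent.[1] If the diff says that file X.txt in the parent was renamed to A.txt in the child, and you are running git log --follow A.txt, the code in git log now starts looking for X.txt.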
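The single-name behaviour is easy to see in a scratch repository. This is a minimal sketch with made-up contents and a placeholder identity; it compares a plain path-limited log with a --follow log across one rename.

```shell
#!/bin/sh
# Sketch: git log --follow switches the name it looks for at a rename.
set -eu

cd "$(mktemp -d)"
git init -q .
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"

printf 'one\ntwo\nthree\n' > X.txt
git add X.txt
git commit -q -m "add X.txt"

git mv X.txt A.txt
git commit -q -m "rename X.txt to A.txt"

# Path-limited log: only commits that touch the path A.txt.
echo "without --follow:"
git log --format=%s -- A.txt

# With --follow, the walk continues under the old name X.txt.
echo "with --follow:"
git log --format=%s --follow -- A.txt
```

The first listing stops at the rename commit, while the second also reaches the commit that created X.txt.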
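Since there is no code to start looking for more than one file at a time, you cannot get this particular cheat to handle the case you want, which is going from looking for one specific file to looking for more than one. (There are really two problems here. One is that, because the internal implementation is fairly limited,[2] git log --follow can only look at one file at a time. The other is that rename detection does not include "combine detection": there is a form of "split detection", enabled with --find-copies and --find-copies-harder, in which Git does copy-finding; the latter is very computationally expensive, and both work in the wrong direction, although that could be handled properly by reversing the order of the diffs.)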
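For reference, this is roughly what the copy-finding looks like from the command line. It is a sketch in a scratch repository with made-up file names and a placeholder identity; it shows the "one source, additional destination" direction that Git does detect, which is the opposite of combining two sources into one file.

```shell
#!/bin/sh
# Sketch: copy detection with -C versus --find-copies-harder.
set -eu

cd "$(mktemp -d)"
git init -q .
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"

printf 'one\ntwo\nthree\n' > A.txt
git add A.txt
git commit -q -m "add A.txt"

# Copy the file without modifying the original.
cp A.txt B.txt
git add B.txt
git commit -q -m "copy A.txt to B.txt"

# -C only considers files modified in the same commit as copy sources,
# so B.txt is reported as a plain addition here.
echo "with -C:"
git diff --name-status -C HEAD~1 HEAD

# --find-copies-harder also inspects unmodified files, at extra cost.
echo "with --find-copies-harder:"
git diff --name-status --find-copies-harder HEAD~1 HEAD
```

In the second listing B.txt is reported as a copy of A.txt (status C with a similarity score) instead of a new file.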
[1] As this implies, --follow does not look at merge diffs at all, at least by default. See also the question "`git log --follow --graph` skips commits".
[2] Also known as a "cheesy hack".