I have a git repository that, when just checked out, takes around 2.3 GiB even in the shallowest configuration, of which 1.9 GiB is inside .git/objects/pack. The working tree files are just about .5 GiB.
Considering I have a remote from which I can re-fetch all the objects if needed, the question is:
What can I delete from inside .git (and how) so that everything deleted could then be re-fetched safely, with simple git commands, from the remote?
Testing a bit, I found out that if I delete everything under .git/objects/pack/, it will be re-downloaded from the remote with a simple git fetch.
There are some complaints like:
error: refs/heads/master does not point to a valid object!
error: refs/remotes/origin/master does not point to a valid object!
error: refs/remotes/origin/HEAD does not point to a valid object!
But then .git/objects/pack gets repopulated and further calls to git fetch don't complain anymore.
Is it safe to nuke .git/objects/pack* like this?
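One way to gain some confidence after such a re-fetch is to ask git itself whether every ref resolves to a complete object graph. The sketch below is only an illustration: the function name refetch_and_verify is hypothetical, and it assumes the remote is called origin. Objects that exist only locally (unpushed commits, stashes) cannot come back from the remote, so it says nothing about those.

```shell
# Hypothetical helper (not from the original post): re-fetch from the remote,
# then check that all refs point at objects that are actually present.
# Assumes the remote is named "origin".
refetch_and_verify() {
    repo=${1:-.}
    # Download whatever objects the remote-tracking refs need.
    git -C "$repo" fetch --quiet origin || return 1
    # Walk the object graph from the refs; a non-zero status means something is missing.
    git -C "$repo" fsck --connectivity-only
}
```

Calling refetch_and_verify /path/to/checkout returns a non-zero status if the fetch fails or if fsck finds refs that still do not resolve.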
Assumptions:
The intent is to reduce as much as possible the amount of space taken by artifacts from a continuous-integration pipeline, while retaining enough information that those artifacts can be downloaded and restored to working order on the developer's workstation with as few (and as ordinary) commands as possible.
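- What can I delete from inside .git (and how) so that everything deleted can be safely re-fetched from the remote with simple git commands?
If you don't want to worry about the internals of .git and whether something is recoverable, you can save just enough information to check everything out again and restore the working area to a state functionally similar to the one it had when it ran in the CI pipeline.
Add a file like this somewhere (let's call it degit.sh):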
#!/bin/bash
set -ex
# Record where this checkout came from before discarding .git.
GIT_REMOTE=$( git remote get-url origin )
# Note: this prints "HEAD" on a detached checkout, which is not a branch name --branch can use.
GIT_BRANCH=$( git rev-parse --abbrev-ref HEAD )
GIT_COMMIT=$( git rev-parse HEAD )
# Unescaped variables are expanded now; \$-escaped ones are expanded when
# .gitrestore runs. If you indent the block below, use TABs, not spaces.
cat <<-EOF > .gitrestore
set -ex
test ! -e .git
tmpclone=\$( mktemp -d --tmpdir=. )
git clone -n --branch="$GIT_BRANCH" "$GIT_REMOTE" "\$tmpclone"
( cd "\$tmpclone" ; git reset --hard $GIT_COMMIT )
mv "\$tmpclone/.git" .
rm -rf "\$tmpclone"
rm -f "\$0"
EOF
rm -rf .git
Then call it at the root of every git repository in the continuous-integration (CI) workspace, so that it generates a .gitrestore file.
It will look something like this:
set -ex
test ! -e .git
tmpclone=$( mktemp -d --tmpdir=. )
git clone -n --branch="example-branch" "git@example.com:example/repo.git" "$tmpclone"
( cd "$tmpclone" ; git reset --hard example-commit-hash )
mv "$tmpclone/.git" .
rm -rf "$tmpclone"
rm -f "$0"
Note that it self-destructs after a successful run. You don't want to run it twice.
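If the CI workspace holds several repositories, running degit.sh in each of them could be automated along the following lines. This is a rough sketch under assumptions: the function name degit_all is hypothetical, the script path must be absolute because the function changes directory, and only repositories whose .git is a directory are found (submodules and linked worktrees, where .git is a file, are skipped).

```shell
# Hypothetical helper (not from the original answer): run a degit script at the
# root of every repository found below a workspace directory.
# $1: workspace directory; $2: ABSOLUTE path to degit.sh (assumed location).
degit_all() {
    workspace=$1
    degit_script=$2
    # -prune keeps find from descending into the .git directories it reports,
    # so removing them while the loop runs does not disturb the traversal.
    find "$workspace" -type d -name .git -prune -print |
    while IFS= read -r gitdir; do
        # Run in a subshell so each cd is undone before the next repository.
        ( cd "$(dirname "$gitdir")" && bash "$degit_script" ) || echo "degit failed in $gitdir" >&2
    done
}
```

For example, degit_all "$PWD" "$PWD/degit.sh" would process every repository below the current directory.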
Now your developers can fetch the CI artifacts and, in each repository, run:
bash .gitrestore
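The developer will then have a repository that looks very much like the one in the CI pipeline, except for an updated view of the remote, which lets her compare what CI had with what she has.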
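That comparison could be sketched as follows. The function name compare_with_remote is hypothetical, and the sketch assumes the remote is called origin and that the current branch tracks an upstream branch, which is the case after git clone --branch but not on a detached HEAD.

```shell
# Hypothetical helper (not from the original answer): summarize how a restored
# checkout differs from the current state of the remote.
# Assumes a remote named "origin" and a current branch with an upstream.
compare_with_remote() {
    repo=${1:-.}
    git -C "$repo" fetch --quiet origin || return 1
    # Commits that reached the tracked remote branch after the commit CI built.
    echo "commits behind upstream: $(git -C "$repo" rev-list --count 'HEAD..@{upstream}')"
    # Files in the working tree that differ from, or are unknown to, the CI commit.
    git -C "$repo" status --short
}
```

Build outputs that are not ignored show up as untracked entries in the status listing, so the list can be long in an artifact directory.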
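This assumes that only the CI machine is space-constrained, and that the developer's machine (and her bandwidth) is not.
If you also want to save space and bandwidth on the developer side, you can pass --depth=1, which clones only the specified branch (i.e., it implies --single-branch) and limits the history to a single commit.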
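A shallow variant of the restore step might look like the sketch below. The function name shallow_restore_git_dir and its three arguments are hypothetical. One caveat worth stating: with --depth=1 only the current tip of the branch is downloaded, so the sketch stops if the branch has moved past the commit CI built, since git reset --hard could not find that commit in a depth-1 clone.

```shell
# Hypothetical shallow variant of the restore step (not from the original answer).
# $1: remote URL, $2: branch name, $3: commit hash recorded by CI.
# Run it at the root of the restored working tree, where .git is missing.
shallow_restore_git_dir() {
    remote=$1
    branch=$2
    commit=$3
    test ! -e .git || return 1
    tmpclone=$(mktemp -d ./tmpclone.XXXXXX) || return 1
    git clone --quiet --no-checkout --depth=1 --branch="$branch" "$remote" "$tmpclone" || return 1
    # A depth-1 clone contains only the branch tip; give up if CI built another commit.
    if [ "$(git -C "$tmpclone" rev-parse HEAD)" != "$commit" ]; then
        echo "branch tip is not $commit; a deeper clone is needed" >&2
        rm -rf "$tmpclone"
        return 1
    fi
    # As in the generated .gitrestore: populate the index in the temporary clone,
    # then move only the .git directory next to the existing working tree.
    git -C "$tmpclone" reset --quiet --hard "$commit" || return 1
    mv "$tmpclone/.git" . && rm -rf "$tmpclone"
}
```

Git ignores --depth for clones from a plain local path, so trying this against a local repository requires a file:// URL.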