CI jobs fail intermittently on GitLab-runner with the error "fatal: shallow file has changed since we read it"

Jea*_*one 6 git gitlab nixos gitlab-ci gitlab-ci-runner

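Jobs on my self-hosted GitLab deployment have recently started failing intermittently with the following git error:

fatal: shallow file has changed since we read it

A complete job log from one such failure: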

Running with gitlab-runner 13.12.0 (v13.12.0)
  on ....50ab V...
Preparing the "shell" executor
00:00
Using Shell executor...
Preparing environment
00:00
Running on saxtons...
Getting source from Git repository
00:03
$ /nix/store/s0frm5z2k43qm66q39ifl2vz96hmyxg4-pre-clone
Fetching changes with git depth set to 50...
Reinitialized existing Git repository in /var/lib/private/gitlab-runner/builds/V.../2/privatestorage/PrivateStorageio/.git/
fatal: shallow file has changed since we read it
Cleaning up file based variables
00:00
ERROR: Job failed: exit status 1

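The pre-clone script contains the following. It fixes permissions on non-writable directories that would otherwise cause the runner's attempt to clean up the git checkout to fail: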

chmod --recursive u+rwX .

The GitLab-runner config.toml contains the following:

check_interval = 0
concurrent = 8

[[runners]]
executor = "shell"
name = "...50ab"
pre_clone_script = "/nix/store/s0frm5z2k43qm66q39ifl2vz96hmyxg4-pre-clone"
token = "V..."
url = "https://.../"

[runners.cache]
[runners.cache.azure]

[runners.cache.gcs]

[runners.cache.s3]

[runners.custom_build_dir]

[[runners]]
executor = "docker"
name = "...5afc"
token = "..."
url = "https://.../"

[runners.cache]
[runners.cache.azure]

[runners.cache.gcs]

[runners.cache.s3]

[runners.custom_build_dir]

[runners.docker]
disable_cache = false
disable_entrypoint_overwrite = false
image = "nixos/nix"
oom_kill_disable = false
privileged = false
shm_size = 0
tls_verify = false
volumes = ["/cache"]

[[runners]]
executor = "docker"
name = "...c334"
token = "..."
url = "https://.../"

[runners.cache]
[runners.cache.azure]

[runners.cache.gcs]

[runners.cache.s3]

[runners.custom_build_dir]

[runners.docker]
disable_cache = false
disable_entrypoint_overwrite = false
image = ".../ubuntu-python3-awscli"
oom_kill_disable = false
privileged = false
shm_size = 0
tls_verify = false
volumes = ["/cache"]

[session_server]
session_timeout = 1800

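GitLab-runner is deployed on NixOS 21.05 (using the NixOS package and service configuration).

I have never seen this git error before.

  • What does it indicate is happening?
  • How can I configure GitLab to stop doing whatever causes it?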

小智 4

TL;DR:

The build directory should be unique for each build that is spawned.

# .gitlab-ci.yml: add as a global config option
variables:
  GIT_CLONE_PATH: '$CI_BUILDS_DIR/$CI_PROJECT_NAME/$CI_PIPELINE_ID'


# Add to gitlab-runner config.toml
[[runners]]
  pre_clone_script = "rm -f /builds/*/*/.git/shallow.lock"
  [runners.custom_build_dir]
    enabled = true
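As a rough illustration of why the GIT_CLONE_PATH setting above keeps builds apart, the sketch below mimics the expansion of that variable for two pipelines of the same project. The function name clone_path, the /builds root, and the pipeline IDs 101 and 102 are made up for the example; the real expansion is done by the runner, not by a script like this.

```shell
#!/bin/sh
# Hypothetical helper: mimics how
#   GIT_CLONE_PATH='$CI_BUILDS_DIR/$CI_PROJECT_NAME/$CI_PIPELINE_ID'
# would expand for a given job. Not part of GitLab or gitlab-runner.
clone_path() {
    # $1 = builds dir, $2 = project name, $3 = pipeline ID
    printf '%s/%s/%s\n' "$1" "$2" "$3"
}

# Two pipelines of the same project (IDs are assumed example values)
# end up in two different checkout directories.
clone_path /builds PrivateStorageio 101
clone_path /builds PrivateStorageio 102
```

Note that all jobs in one pipeline share the same CI_PIPELINE_ID, so if jobs from a single pipeline can run concurrently on the same host, a per-job or per-slot variable such as $CI_JOB_ID or $CI_CONCURRENT_ID may be a better path component.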

Reason: I had multiple Docker gitlab-runners set up on the same host.

Concurrent pipelines running with the Docker executor were accessing the same build directory:

/build/PROJECT_NAME/REPO/.git/

They would overwrite each other's directory contents. Canceling a job during the clone stage also leaves a shallow.lock file behind.
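The pre_clone_script shown earlier removes every shallow.lock unconditionally, which could in principle also remove the lock of a clone that is still running in another job. A more cautious variant, sketched below, only deletes lock files older than some threshold. The function name remove_stale_shallow_locks, the /builds root, and the 30-minute threshold are assumptions for illustration, not something the runner provides.

```shell
#!/bin/sh
# Hypothetical helper: delete shallow.lock files under a builds root,
# but only those last modified more than a given number of minutes ago.
remove_stale_shallow_locks() {
    builds_root=$1      # e.g. /builds (assumed location)
    max_age_minutes=$2  # e.g. 30 (assumed threshold)

    # Nothing to do if the builds root does not exist yet.
    [ -d "$builds_root" ] || return 0

    # -mmin and -delete are extensions supported by GNU and BSD find.
    find "$builds_root" -type f -path '*/.git/shallow.lock' \
        -mmin "+$max_age_minutes" -delete
}
```

A pre-clone script could then call it as `remove_stale_shallow_locks /builds 30`. This only cleans up leftovers from canceled jobs; it does not replace giving each build its own directory.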