Helm error: release requires a rollback before it can be upgraded

Efr*_*tan 5 kubernetes kubernetes-helm fluxcd

In my cluster I use Weave Flux and its flux-helm-operator to manage the cluster the GitOps way.

However, when I change a chart in the Flux git repository, I often run into the following error message:

ts=2019-09-25T11:54:37.604506452Z caller=chartsync.go:328 component=chartsync
warning="unable to proceed with release" 
resource=mychart:helmrelease/mychart release=mychart
err="release requires a rollback before it can be upgraded (FAILED)"

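I'm not sure what this means in Helm. In any case, I'm not supposed to run any helm commands myself, because the releases are managed by Flux, so I'd like to know the proper way to handle this error in production

(other than deleting the release and waiting for Flux to recreate it).

A well-explained answer would be much appreciated, thanks.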

Yas*_*sen 5

Let's dig into the helm-operator code.

The warning unable to proceed with release is emitted after the call to GetUpgradableRelease:

    // GetUpgradableRelease returns a release if the current state of it
    // allows an upgrade, a descriptive error if it is not allowed, or
    // nil if the release does not exist.

The error release requires a rollback before it can be upgraded is returned when the release has the status Status_FAILED (see release.go#89).

An UNHEALTHY state blocks the release

As the Flux developers mention in #2265, a release in an UNHEALTHY state cannot be rolled forward.

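This is not a bug, but I can see where your expectation is coming from.

Flux will only move forward on healthy releases, one of the reasons being to ensure we do not end up in a loop of failure. The --force flag is thus not intended to be used to force the upgrade of an unhealthy resource (you should use the rollback feature for this), but was developed to make it possible to upgrade charts with backwards-incompatible changes (changes to immutable fields, for example, which require the resource to be removed first, see #1760).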

Conclusion: forceUpgrade is honored, but it cannot be used to force the upgrade of a release that is in an UNHEALTHY state.

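Rollback

As the author suggests, you should use the rollback feature.

A release made by the Helm operator may fail from time to time. The rollback of a failed release can be automated by setting .spec.rollback.enable to true on the HelmRelease resource.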

Note: a successful rollback of a Helm chart containing a StatefulSet resource is known to be tricky, and one of the main reasons automated rollbacks are not enabled by default for all HelmReleases. Verify a manual rollback of your Helm chart does not cause any problems before enabling it.

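When enabled, the Helm operator detects a faulty upgrade and performs a rollback. It will not attempt a new upgrade unless it detects a change in the values and/or the chart.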

apiVersion: flux.weave.works/v1beta1
kind: HelmRelease
# metadata: ...
spec:
  # Listed values are the defaults.
  rollback:
    # If set, will perform rollbacks for this release.
    enable: false
    # If set, will force resource update through delete/recreate if
    # needed.
    force: false
    # Prevent hooks from running during rollback.
    disableHooks: false
    # Time in seconds to wait for any individual Kubernetes operation.
    timeout: 300
    # If set, will wait until all Pods, PVCs, Services, and minimum
    # number of Pods of a Deployment are in a ready state before
    # marking the release as successful. It will wait for as long
    # as the set timeout.
    wait: false
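For the release in the question, turning on automated rollback might look roughly like the sketch below. The name and namespace are taken from the log line (resource=mychart:helmrelease/mychart) and are assumptions about the actual manifest; the chart source and values are left out on purpose and should stay as they already are in the git repository. Only the rollback fields listed above are used, and everything except enable keeps its default.

```yaml
apiVersion: flux.weave.works/v1beta1
kind: HelmRelease
metadata:
  # Assumed from the log line in the question; adjust to the real manifest.
  name: mychart
  namespace: mychart
spec:
  # Existing chart source and values go here unchanged (omitted in this sketch).
  rollback:
    # Let the Helm operator roll back a failed upgrade automatically.
    enable: true
```

Since the change is committed to the Flux git repository like any other, no helm command has to be run by hand. As the note above says, it is worth verifying a manual rollback of the chart in a non-production environment first, especially if the chart contains a StatefulSet.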