K8s controller: updating status and conditions

Question


bre*_*uts 6 azure go kubernetes

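I have a k8s controller that needs to install some resources and update the status and conditions accordingly.

The flow in the reconcile is as follows:

1. Install the resource and don't wait.
2. Call the function checkAvailability and update the status accordingly (ready / pending install / error).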

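I have two main questions:

1. This is the first time I have used status and conditions. Is this the right way, or am I missing something?
2. Sometimes, when I do the update with r.Status().Update, I get the error: Operation cannot be fulfilled on eds.core.vtw.bmw.com "resouce01": the object has been modified; please apply your changes to the latest version and try again. So I added the conditionChanged check, which solves the problem, but I am not sure it is correct: I update the status once and, if it has not changed, I don't touch it. The user may therefore see a ready status from a while ago, and the reconcile does not update the date and time of the ready condition, because it skips the update when the condition is already "ready".

I use the following: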

    func (r *ebdReconciler) checkHealth(ctx context.Context, req ctrl.Request, ebd ebdmanv1alpha1.ebd) (bool, error) {
        vfmReady, err := r.mbr.IsReady(ctx, req.Name, req.Namespace)
        condition := metav1.Condition{
            Type:               ebdmanv1alpha1.KubernetesvfmHealthy,
            ObservedGeneration: ebd.Generation,
            LastTransitionTime: metav1.Now(),
        }
        if err != nil {
            // There was an error checking readiness - Set status to false
            condition.Status = metav1.ConditionFalse
            condition.Reason = ebdmanv1alpha1.ReasonError
            condition.Message = fmt.Sprintf("Failed to check vfm readiness: %v", err)
        } else if vfmReady {
            // The vfm is ready - Set status to true
            condition.Status = metav1.ConditionTrue
            condition.Reason = ebdmanv1alpha1.ReasonReady
            condition.Message = "vfm custom resource is ready"
        } else {
            // The vfm is not ready - Set status to false
            condition.Status = metav1.ConditionFalse
            condition.Reason = ebdmanv1alpha1.ResourceProgressing
            condition.Message = "vfm custom resource is not ready"
        }
        // Get the latest version of the ebd
        latestebd := ebdmanv1alpha1.ebd{}
        if err := r.Get(ctx, req.NamespacedName, &latestebd); err != nil {
            return vfmReady, err
        }

        oldConditions := latestebd.Status.Conditions
        meta.SetStatusCondition(&latestebd.Status.Conditions, condition)

        if !conditionChanged(&oldConditions, &latestebd.Status.Conditions, ebdmanv1alpha1.KubernetesvfmHealthy) {
            return vfmReady, nil
        }

        if err := r.Status().Update(ctx, &latestebd); err != nil {
            r.Log.Error(err, "failed to update vfm status")
            return vfmReady, err
        }
        return vfmReady, nil
    }

    func conditionChanged(oldConditions, newConditions *[]metav1.Condition, conditionType string) bool {
        newCondition := meta.FindStatusCondition(*newConditions, conditionType)
        oldCondition := meta.FindStatusCondition(*oldConditions, conditionType)
        if oldCondition == nil && newCondition == nil {
            return false
        }
        if oldCondition == nil || newCondition == nil {
            return true
        }
        return oldCondition.Status != newCondition.Status || oldCondition.Reason != newCondition.Reason || oldCondition.Message != newCondition.Message
    }


Answer 1

Von*_*onC 3

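For your questions:

"This is the first time I have used status and conditions. Is this the right way, or am I missing something?"

Your approach to managing the status and conditions of Kubernetes resources is generally fine. The status subresource of a Kubernetes API object is typically used to represent the current state of the system, and it can include conditions.

A condition is a set of fields that describes the state of an object in more detail than a plain true or false. Each condition typically has a type, status, reason, message, and lastTransitionTime. Your code sets these fields correctly based on whether the vfm custom resource is ready.

Note that conditions should be level-based: they should be set to the currently observed value, regardless of their previous value. They should also be set (to true, false, or unknown) for all significant or user-meaningful aspects of the component's current state. This makes conditions a good mechanism for indicating "transient states" such as Progressing or Degraded, which can be expected to change over time or with external state.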
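
To make "level-based" concrete, here is a minimal sketch that derives a condition from the current observation alone. It uses only the standard library: the Condition type is a trimmed, hypothetical stand-in for metav1.Condition, and the names healthCondition and errSample are made up for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// Condition is a local stand-in for metav1.Condition, trimmed to the
// fields used here so the sketch needs no Kubernetes dependencies.
type Condition struct {
	Type    string
	Status  string // "True", "False", or "Unknown"
	Reason  string
	Message string
}

// errSample is a placeholder for a failed readiness probe.
var errSample = errors.New("timeout")

// healthCondition computes the condition from what is observed right now.
// It never reads the previous condition, which is what makes it level-based.
func healthCondition(ready bool, checkErr error) Condition {
	c := Condition{Type: "Healthy"}
	switch {
	case checkErr != nil:
		c.Status, c.Reason = "False", "Error"
		c.Message = fmt.Sprintf("failed to check readiness: %v", checkErr)
	case ready:
		c.Status, c.Reason = "True", "Ready"
		c.Message = "resource is ready"
	default:
		c.Status, c.Reason = "False", "Progressing"
		c.Message = "resource is not ready"
	}
	return c
}

func main() {
	fmt.Println(healthCondition(true, nil).Status)
	fmt.Println(healthCondition(false, nil).Reason)
	fmt.Println(healthCondition(false, errSample).Message)
}
```

Because the function is a pure mapping from observation to condition, running it on every reconcile always yields the same result for the same observed state.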
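
"Sometimes, when I do the update with r.Status().Update, I get the error: Operation cannot be fulfilled on eds.core.vtw.bmw.com "resource01": the object has been modified; please apply your changes to the latest version and try again."

This error occurs because another client updated the same object while your controller was processing it, so the resourceVersion your update carries no longer matches the stored one. The other client could be a different controller, or another instance of the same controller if you run more than one.

One way to handle this is a retry mechanism that re-attempts the status update when this error occurs. In your case, you have implemented a conditionChanged check that only attempts the status update when the condition has changed. This is a good way to avoid unnecessary updates, but it does not fully prevent the error, because another client can still update the object between your Get call and your Status().Update call.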
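
The race between Get and Update, and why a retry loop resolves it, can be sketched without a cluster. The following is a standard-library-only illustration: object, store, and updateStatus are hypothetical in-memory stand-ins for a custom resource, the API server, and a retry helper, and the loop omits the backoff a real helper would add between attempts.

```go
package main

import (
	"errors"
	"fmt"
)

// errConflict plays the role of the API server's conflict response.
var errConflict = errors.New("the object has been modified")

// object is a stand-in for a custom resource with a version counter.
type object struct {
	ResourceVersion int
	Status          string
}

// store is a stand-in for the API server.
type store struct{ current object }

func (s *store) Get() object { return s.current }

// Update succeeds only if the caller's copy is based on the stored version.
func (s *store) Update(o object) error {
	if o.ResourceVersion != s.current.ResourceVersion {
		return errConflict
	}
	o.ResourceVersion++
	s.current = o
	return nil
}

// updateStatus re-reads the object and re-applies the change on every
// attempt. interfere, if non-nil, runs between Get and Update to simulate
// another writer. It returns the number of attempts used.
func updateStatus(s *store, status string, maxAttempts int, interfere func(attempt int)) (int, error) {
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		latest := s.Get()
		latest.Status = status
		if interfere != nil {
			interfere(attempt)
		}
		err := s.Update(latest)
		if err == nil {
			return attempt, nil
		}
		if !errors.Is(err, errConflict) {
			return attempt, err
		}
	}
	return maxAttempts, errConflict
}

func main() {
	s := &store{}
	attempts, err := updateStatus(s, "Ready", 5, func(attempt int) {
		if attempt == 1 {
			// Another writer modifies the object once, mid-flight.
			other := s.Get()
			other.Status = "Touched"
			_ = s.Update(other)
		}
	})
	fmt.Println(attempts, err, s.current.Status)
}
```

The key point is that both the read and the modification happen inside the loop: retrying the write with the stale copy would fail again with the same conflict.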
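
You could also consider using Patch instead of Update to modify the status, which reduces the risk of conflicts with other updates. A patch sends only a partial update of the object, so you are less likely to run into conflicts.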
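
As a rough illustration of what "partial update" means, the sketch below builds a JSON merge patch body that touches only status.conditions. It is standard-library-only; the condition type and statusMergePatch helper are hypothetical. With controller-runtime you would normally not build the body by hand but call something like r.Status().Patch(ctx, obj, client.MergeFrom(original)), which computes the diff for you.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// condition is a trimmed stand-in for the JSON shape of a status condition.
type condition struct {
	Type   string `json:"type"`
	Status string `json:"status"`
	Reason string `json:"reason"`
}

// statusMergePatch builds a JSON merge patch body that contains only
// status.conditions. The body carries no metadata.resourceVersion, so it
// does not ask the server to reject the write if the object has moved on.
func statusMergePatch(conds []condition) ([]byte, error) {
	patch := map[string]interface{}{
		"status": map[string]interface{}{"conditions": conds},
	}
	return json.Marshal(patch)
}

func main() {
	body, err := statusMergePatch([]condition{
		{Type: "Healthy", Status: "True", Reason: "Ready"},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

One caveat of this design: a JSON merge patch replaces a list as a whole, so the conditions array in the body has to contain every condition you want to keep, not just the one that changed.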
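
Regarding the timing question, consider updating LastTransitionTime only when the status actually changes, rather than every time a health check completes. LastTransitionTime then reflects when the status last changed, not when the check last ran.

Keep in mind that frequently updating the status subresource, even when the status has not changed, puts unnecessary load on the API server. Aim to update the status only when it has changed.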
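
Both points, "write only on change" and "move LastTransitionTime only on a status flip", can be combined in one helper. The sketch below is standard-library-only; Condition, find, setCondition, and at are hypothetical stand-ins, loosely modelled on what meta.FindStatusCondition and meta.SetStatusCondition do. Note that the change check runs before the element is mutated: find returns a pointer into the slice, so comparing after the write would always report "no change".

```go
package main

import (
	"fmt"
	"time"
)

// Condition is a trimmed stand-in for metav1.Condition.
type Condition struct {
	Type               string
	Status             string
	Reason             string
	Message            string
	LastTransitionTime time.Time
}

// at returns a fixed base time plus the given number of hours.
func at(hours int) time.Time {
	return time.Date(2023, 1, 1, 0, 0, 0, 0, time.UTC).Add(time.Duration(hours) * time.Hour)
}

// find returns a pointer to the condition of the given type, or nil.
func find(conds []Condition, condType string) *Condition {
	for i := range conds {
		if conds[i].Type == condType {
			return &conds[i]
		}
	}
	return nil
}

// setCondition stores next in conds and reports whether anything
// user-visible changed, that is, whether a status write is needed.
// LastTransitionTime moves only when Status flips.
func setCondition(conds *[]Condition, next Condition, now time.Time) bool {
	existing := find(*conds, next.Type)
	if existing == nil {
		next.LastTransitionTime = now
		*conds = append(*conds, next)
		return true
	}
	// Decide before mutating: existing points into the slice itself.
	changed := existing.Status != next.Status ||
		existing.Reason != next.Reason ||
		existing.Message != next.Message
	if existing.Status != next.Status {
		existing.LastTransitionTime = now
	}
	existing.Status, existing.Reason, existing.Message = next.Status, next.Reason, next.Message
	return changed
}

func main() {
	var conds []Condition
	ready := Condition{Type: "Healthy", Status: "True", Reason: "Ready", Message: "ok"}
	fmt.Println(setCondition(&conds, ready, at(0))) // first observation
	fmt.Println(setCondition(&conds, ready, at(1))) // same observation an hour later
	fmt.Println(conds[0].LastTransitionTime.Equal(at(0)))
}
```

A caller would issue the status update only when setCondition returns true, which keeps repeated "still ready" reconciles from touching the API server.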

Taking these points into account, a possible updated version of your checkHealth function could be:

    // Assumed imports: context, fmt, meta "k8s.io/apimachinery/pkg/api/meta",
    // metav1 "k8s.io/apimachinery/pkg/apis/meta/v1", "k8s.io/client-go/util/retry",
    // ctrl "sigs.k8s.io/controller-runtime", and the project's ebdmanv1alpha1 API package.
    func (r *ebdReconciler) checkHealth(ctx context.Context, req ctrl.Request, ebd ebdmanv1alpha1.ebd) (bool, error) {
        vfmReady, err := r.mbr.IsReady(ctx, req.Name, req.Namespace)
        condition := metav1.Condition{
            Type:   ebdmanv1alpha1.KubernetesvfmHealthy,
            Status: metav1.ConditionUnknown, // start with unknown status
        }

        if err != nil {
            // There was an error checking readiness - Set status to false
            condition.Status = metav1.ConditionFalse
            condition.Reason = ebdmanv1alpha1.ReasonError
            condition.Message = fmt.Sprintf("Failed to check vfm readiness: %v", err)
        } else if vfmReady {
            // The vfm is ready - Set status to true
            condition.Status = metav1.ConditionTrue
            condition.Reason = ebdmanv1alpha1.ReasonReady
            condition.Message = "vfm custom resource is ready"
        } else {
            // The vfm is not ready - Set status to false
            condition.Status = metav1.ConditionFalse
            condition.Reason = ebdmanv1alpha1.ResourceProgressing
            condition.Message = "vfm custom resource is not ready"
        }

        // Retry on conflict
        // RetryOnConflict uses exponential backoff to avoid exhausting the apiserver
        retryErr := retry.RetryOnConflict(retry.DefaultRetry, func() error {
            // Retrieve the latest version of ebd on every attempt, so the
            // condition is always applied to a fresh copy
            latestebd := ebdmanv1alpha1.ebd{}
            if getErr := r.Get(ctx, req.NamespacedName, &latestebd); getErr != nil {
                return getErr
            }
            oldCondition := meta.FindStatusCondition(latestebd.Status.Conditions, condition.Type)

            // Nothing changed: skip the write. This check must run before
            // SetStatusCondition, which mutates the element oldCondition points to
            if oldCondition != nil && condition.Status == oldCondition.Status && condition.Reason == oldCondition.Reason && condition.Message == oldCondition.Message {
                return nil
            }

            // Only update the LastTransitionTime if the status has changed
            if oldCondition == nil || oldCondition.Status != condition.Status {
                condition.LastTransitionTime = metav1.Now()
            } else {
                condition.LastTransitionTime = oldCondition.LastTransitionTime
            }

            meta.SetStatusCondition(&latestebd.Status.Conditions, condition)
            return r.Status().Update(ctx, &latestebd)
        })

        if retryErr != nil {
            r.Log.Error(retryErr, "Failed to update vfm status after retries")
            return vfmReady, retryErr
        }

        return vfmReady, nil
    }


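In this updated version:

1. The LastTransitionTime field is updated only when the condition's status changes. LastTransitionTime therefore reflects when the status last changed, not when checkHealth last ran, which gives a more accurate timeline of actual state changes than of reconcile loop runs.
2. A retry mechanism based on retry.RetryOnConflict re-attempts the status update when a conflict error occurs. You need to import the "k8s.io/client-go/util/retry" package for this. This is a common pattern for handling the Operation cannot be fulfilled... error.

These changes should help with the issues you are facing when updating the status and conditions of your Kubernetes resources.
Keep in mind that you may still hit an occasional conflict error, especially when other clients update the same object. In those cases, RetryOnConflict retries the update against the latest version of the object.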

Archived:	2 years, 9 months ago
Views:	1312
Last recorded:	2 years, 9 months ago