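I noticed that when a post-upgrade Helm hook fails, it is retried 5 times before giving up. How can I make sure the hook is attempted only once and is abandoned if it fails? Alternatively, can I make the Helm hook retry only under specific failure conditions instead of always retrying? I couldn't find any documentation or parameters for this use case.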
You can set the Pod backoff failure policy.
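According to the k8s documentation: Pod backoff failure policy: There are situations where you want to fail a Job after some amount of retries due to a logical error in configuration etc. To do so, set .spec.backoffLimit to specify the number of retries before considering a Job as failed. The back-off limit is set by default to 6.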
Add backoffLimit: 1 to the Job spec (it sits at the Job level, next to template, not inside the Pod spec), for example:
spec:
  backoffLimit: 1
  template:
    spec:
      containers:
      - name:
        image:
      restartPolicy: Never
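Since backoffLimit counts retries rather than total attempts, a value of 1 should still allow one retry after the first failure. If the goal is a single attempt, as the question asks, a value of 0 is probably closer. The fragment below is a minimal sketch under that assumption; the container name and image are placeholders, not taken from the original answer.

```yaml
spec:
  # 0 retries: the Job should be marked failed after the first Pod failure.
  backoffLimit: 0
  template:
    spec:
      # Never keeps the kubelet from restarting the container in place,
      # so every failure is counted against backoffLimit.
      restartPolicy: Never
      containers:
      - name: example-hook        # placeholder name
        image: "example/image:tag" # placeholder image
```

With restartPolicy: OnFailure the failed container is restarted inside the same Pod instead, so Never is the easier setting to reason about here.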
Full example:
apiVersion: batch/v1
kind: Job
metadata:
  name: {{ .Release.Name | quote }}
  labels:
    app.kubernetes.io/managed-by: {{ .Release.Service | quote }}
    app.kubernetes.io/instance: {{ .Release.Name | quote }}
  annotations:
    # This is what defines this resource as a hook. Without this line, the
    # job is considered part of the release.
    "helm.sh/hook": post-upgrade
    "helm.sh/hook-weight": "-5"
    "helm.sh/hook-delete-policy": hook-succeeded
spec:
  backoffLimit: 1 # here
  template:
    metadata:
      name: {{ .Release.Name | quote }}
      labels:
        app.kubernetes.io/managed-by: {{ .Release.Service | quote }}
        app.kubernetes.io/instance: {{ .Release.Name | quote }}
    spec:
      restartPolicy: Never
      containers:
      - name: post-install-job
        image: "alpine:3.3"
        command: ["/bin/sleep","{{ default "10" .Values.hook.job.sleepyTime }}"]
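If the retry count should be adjustable per environment, one option is to read it from the chart values, in the same way the example reads sleepyTime. This is only a sketch: the key hook.job.backoffLimit is a hypothetical name introduced for illustration and is not part of the original answer.

```yaml
# values.yaml (hypothetical key, assumed for this sketch)
hook:
  job:
    backoffLimit: 0

# templates/post-upgrade-job.yaml (excerpt of the Job spec)
spec:
  # Falls back to 0 (no retries) when the value is unset.
  backoffLimit: {{ .Values.hook.job.backoffLimit | default 0 }}
```

One thing to watch: Helm's default function treats 0 as an empty value, so a non-zero fallback such as default 6 would silently override an explicit 0 in values.yaml. Using 0 as the fallback avoids that surprise.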