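I created a CronJob in Kubernetes with the schedule (8 * * * *). The Job's backoffLimit is left at its default of 6, the pod's restartPolicy is Never, and the pods are deliberately configured to FAIL. As I understand it, for a podSpec with restartPolicy: Never the Job controller will try to create backoffLimit number of pods and then mark the Job as Failed, so I expected to see 6 pods in the Error state.
This is the actual status of the Job: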
status:
  conditions:
  - lastProbeTime: 2019-02-20T05:11:58Z
    lastTransitionTime: 2019-02-20T05:11:58Z
    message: Job has reached the specified backoff limit
    reason: BackoffLimitExceeded
    status: "True"
    type: Failed
  failed: 5
Why are there only 5 failed pods instead of 6? Or is my understanding of backoffLimit incorrect?
MWZ*_*MWZ 13
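In short: you might not be seeing all of the created pods because the schedule period in the CronJob is too short.
As described in the documentation:
Failed Pods associated with the Job are recreated by the Job controller with an exponential back-off delay (10s, 20s, 40s ...) capped at six minutes. The back-off count is reset if no new failed Pods appear before the Job's next status check.
If a new Job is scheduled before the Job controller has had a chance to recreate a pod (bearing in mind the delay after the previous failure), the Job controller starts counting from one again.
I reproduced your issue in GKE using the following .yaml: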
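To put rough numbers on the delays in the quoted passage, here is a minimal Python sketch. It assumes the delay simply doubles from 10 seconds and is capped at 360 seconds (six minutes), which matches the documented sequence but is not taken from the controller's source code. `backoff_delays` is a hypothetical helper name used only for illustration.

```python
def backoff_delays(retries, base=10, cap=360):
    """Delay in seconds before each retry: doubles from `base`, capped at `cap`.

    Assumption: plain doubling, as suggested by the documented
    "10s, 20s, 40s ..." sequence; real controller timing may differ.
    """
    return [min(base * 2 ** i, cap) for i in range(retries)]


delays = backoff_delays(6)
total_seconds = sum(delays)

print(delays)          # per-retry delays in seconds
print(total_seconds)   # cumulative wait across all six retries
```

Under this assumption the later retries are minutes apart, and the six delays add up to about ten and a half minutes, which is longer than a schedule period of a few minutes.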
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: hellocron
spec:
  schedule: "*/3 * * * *" #Runs every 3 minutes
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hellocron
            image: busybox
            args:
            - /bin/cat
            - /etc/os
          restartPolicy: Never
      backoffLimit: 6
  suspend: false
This job will fail because the file /etc/os does not exist.
Here is the kubectl describe output for one of the jobs:
Name:           hellocron-1551194280
Namespace:      default
Selector:       controller-uid=b81cdfb8-39d9-11e9-9eb7-42010a9c00d0
Labels:         controller-uid=b81cdfb8-39d9-11e9-9eb7-42010a9c00d0
                job-name=hellocron-1551194280
Annotations:    <none>
Controlled By:  CronJob/hellocron
Parallelism:    1
Completions:    1
Start Time:     Tue, 26 Feb 2019 16:18:07 +0100
Pods Statuses:  0 Running / 0 Succeeded / 6 Failed
Pod Template:
  Labels:  controller-uid=b81cdfb8-39d9-11e9-9eb7-42010a9c00d0
           job-name=hellocron-1551194280
  Containers:
   hellocron:
    Image:      busybox
    Port:       <none>
    Host Port:  <none>
    Args:
      /bin/cat
      /etc/os
    Environment:  <none>
    Mounts:       <none>
  Volumes:        <none>
Events:
  Type     Reason                Age   From            Message
  ----     ------                ----  ----            -------
  Normal   SuccessfulCreate      26m   job-controller  Created pod: hellocron-1551194280-4lf6h
  Normal   SuccessfulCreate      26m   job-controller  Created pod: hellocron-1551194280-85khk
  Normal   SuccessfulCreate      26m   job-controller  Created pod: hellocron-1551194280-wrktb
  Normal   SuccessfulCreate      26m   job-controller  Created pod: hellocron-1551194280-6942s
  Normal   SuccessfulCreate      25m   job-controller  Created pod: hellocron-1551194280-662zv
  Normal   SuccessfulCreate      22m   job-controller  Created pod: hellocron-1551194280-6c6rh
  Warning  BackoffLimitExceeded  17m   job-controller  Job has reached the specified backoff limit
Note the delay between the creation of pods hellocron-1551194280-662zv (25m ago) and hellocron-1551194280-6c6rh (22m ago).
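The growing gaps are easier to see when computed from the Age column. The following is only a rough Python sketch: the ages are copied by hand from the Events table above, and kubectl rounds them to whole minutes, so the gaps are approximate. The variable names are illustrative.

```python
# Ages from the Events table, in minutes: six SuccessfulCreate events
# followed by the final BackoffLimitExceeded warning.
ages_min = [26, 26, 26, 26, 25, 22, 17]

# Time between each event and the one after it, in minutes.
gaps = [earlier - later for earlier, later in zip(ages_min, ages_min[1:])]

print(gaps)  # [0, 0, 0, 1, 3, 5]
```

The first few pods are created within the same minute, while the last gaps stretch to several minutes, which fits the exponential back-off described in the documentation.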
小智 5
Use spec.backoffLimit to specify the number of retries before a Job is considered failed. The back-off limit is set to 6 by default.