acm*_*acm 2 sigterm kubernetes
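I have defined a preStop hook in a StatefulSet pod resource. The hook runs a bash script to make sure the pod is not killed until a few processes in the application have finished, been cancelled, or errored out. Initially I had not defined terminationGracePeriodSeconds. When I deleted the pod, I verified that the script in the preStop hook ran as expected. But after adding a terminationGracePeriodSeconds of 10 minutes, the bash script in the preStop hook runs successfully for a few minutes, after which the pod should be killed. Instead, the pod hangs in the TERMINATING state and is only killed after 10 minutes.
How can I fix this? Is there a way to send SIGTERM or SIGKILL to the pod? Any ideas? Thank you in advance!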
StatefulSet.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  labels:
    app: appx
  name: appx
spec:
  serviceName: appx
  replicas: 1
  updateStrategy:
    type: RollingUpdate
  selector:
    matchLabels:
      app: appx
  template:
    metadata:
      labels:
        app: appx
    spec:
      #removed some of the sensitive info
      terminationGracePeriodSeconds: 600
      containers:
        - image: appx
          imagePullPolicy: IfNotPresent
          name: appx
          lifecycle:
            preStop:
              exec:
                command: ["/bin/sh", "-c", "sleep 30 && bash /var/tmp/runscript.sh; sleep 10"]
kubectl describe pod
**kubectl describe pod appx**
Name:           appx
Namespace:      default
Priority:       0
Node:           docker-desktop/192.168.65.3
Start Time:     Mon, 21 Sep 2020 07:30:55 -0500
Labels:         app=appx
Annotations:    <none>
Status:         Running
IP:             x.x.x.x
Controlled By:  StatefulSet/appx
Containers:
  appx:
    Container ID:   docker://dfdgfgfgfgfgfgfg
    Image:          appx
    Image ID:       docker://sha256:49dfgfgfgfgfgfgfgfgfg96a6fc
    Port:           <none>
    Host Port:      <none>
    State:          Running
      Started:      Mon, 21 Sep 2020 07:30:56 -0500
    Ready:          True
    Restart Count:  0
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  data:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  appx-token-xj6q9:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  appx-token-fhfdlf
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type    Reason     Age    From                     Message
  ----    ------     ----   ----                     -------
  Normal  Scheduled  2m43s  default-scheduler        Successfully assigned default/appx to docker-desktop
  Normal  Pulled     2m42s  kubelet, docker-desktop  Container image "appx" already present on machine
  Normal  Created    2m42s  kubelet, docker-desktop  Created container appx
  Normal  Started    2m42s  kubelet, docker-desktop  Started container appx
The preStop hook and terminationGracePeriodSeconds are asynchronous. As soon as the kubelet sees that a Pod has been marked as terminating, it begins the local Pod shutdown process, and the grace period countdown covers the time spent in the preStop hook as well. This means that if the container doesn't terminate within the grace period, a SIGKILL signal will be sent and the container will be killed regardless of whether the commands in the preStop hook have completed.
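- When terminationGracePeriodSeconds was not set, the flow worked as expected: the pod was killed as soon as the script finished, or within 30 seconds (the default terminationGracePeriodSeconds). But when I add a grace period of 10 minutes or more, it waits until that time has passed and then kills the pod.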
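A grace period always applies. As I mentioned in the comments, terminationGracePeriodSeconds simply defaults to 30 seconds. So what happens if terminationGracePeriodSeconds is less than the time needed to complete the preStop hook?
Then the container will be terminated at the end of terminationGracePeriodSeconds, and the preStop hook will not finish running.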
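When terminationGracePeriodSeconds is set to 600 seconds, the preStop hook script hangs. (It is currently unclear whether it ever worked, because it was never properly tested with the default 30-second terminationGracePeriodSeconds, where the container was terminated pre-emptively.) This means that some processes are not handling SIGTERM correctly, which the preStop hook does not currently correct for, so the container waits for the SIGKILL that is sent once the 10-minute termination grace period ends.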
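One way to keep a preStop script from hanging for the whole grace period is to give its waiting loop a deadline of its own. The following is only a minimal sketch under that assumption: `wait_with_deadline` is a hypothetical helper name, and the command it polls stands in for whatever check `runscript.sh` actually performs, which the question does not show.

```shell
#!/bin/sh
# Hypothetical helper: poll a command once per second until it succeeds
# or until the given number of seconds has elapsed.
# Usage: wait_with_deadline SECONDS COMMAND [ARGS...]
# Returns 0 if the command succeeded in time, 1 if the deadline passed.
wait_with_deadline() {
    limit=$1
    shift
    elapsed=0
    while [ "$elapsed" -lt "$limit" ]; do
        # Stop waiting as soon as the check reports success.
        if "$@"; then
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    # Deadline reached without the check ever succeeding.
    return 1
}
```

A preStop script could then call something like `wait_with_deadline 120 sh -c '! pgrep -x worker >/dev/null'`, where `worker` is a placeholder process name, so that the hook returns after at most about two minutes instead of blocking indefinitely.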
If you take a look here, you will find out that even though the user specified a preStop hook, they needed to SIGTERM nginx for a graceful shutdown.
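In your case, with terminationGracePeriodSeconds set to 10 minutes, Kubernetes waits the full 10 minutes before killing your container even though your preStop hook executed successfully, because that is exactly what you told it to do. The termination signal is sent by the kubelet, but it is not being passed on to the application inside the container. The most common reason is that the container runs a shell which in turn runs the application process, so the signal may be consumed or interrupted by the shell itself instead of being passed to the child process. Also, since it is unclear what your runscript.sh is doing, it is difficult to make any other suggestions about which processes are failing to handle SIGTERM.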
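To illustrate the shell problem described above, here is a hedged sketch of a wrapper that forwards SIGTERM to the child process instead of swallowing it. `forward_term` is a hypothetical name, not something from the question, and the real entrypoint of the `appx` image is unknown. When the shell has nothing else to do after starting the application, a plain `exec` of the application is usually the simpler alternative, because the application then replaces the shell and receives the signal directly.

```shell
#!/bin/sh
# Hypothetical wrapper: start a command as a child of this shell and
# forward SIGTERM to it.
# Usage: forward_term COMMAND [ARGS...]
forward_term() {
    # Start the application in the background and remember its PID.
    "$@" &
    child=$!
    # When this shell receives SIGTERM, pass it on to the child.
    trap 'kill -TERM "$child" 2>/dev/null' TERM
    # wait returns early when the trap fires ...
    wait "$child" || true
    # ... so wait once more to let the child finish its own cleanup.
    # If the child has already been collected, this returns at once.
    wait "$child" 2>/dev/null || true
    # Restore the default SIGTERM behaviour for the shell.
    trap - TERM
}
```

This sketch does not try to propagate the child's exit status; an entrypoint that needs the status would have to capture it from the second `wait`.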
What can you do in this case? The options for ending sooner are: