liv*_*ton 5 kubernetes kubectl kubernetes-statefulset
According to the kubectl docs, kubectl rollout restart applies to Deployments, DaemonSets and StatefulSets. It works as expected for Deployments. But for StatefulSets, it restarts only one of the two pods.
$ k rollout restart statefulset alertmanager-main        (playground-fdp/monitoring)
statefulset.apps/alertmanager-main restarted

$ k rollout status statefulset alertmanager-main         (playground-fdp/monitoring)
Waiting for 1 pods to be ready...
Waiting for 1 pods to be ready...
statefulset rolling update complete 2 pods at revision alertmanager-main-59d7ccf598...

$ kgp -l app=alertmanager                                 (playground-fdp/monitoring)
NAME                  READY   STATUS    RESTARTS   AGE
alertmanager-main-0   2/2     Running   0          21h
alertmanager-main-1   2/2     Running   0          20s
As you can see, the pod alertmanager-main-1 was restarted and its age is 20s, while the other pod in the statefulset, alertmanager-main-0, has not been restarted and its age is still 21h. Any idea how to restart the statefulset after a configmap it uses has been updated?
[Update 1] Here is the statefulset configuration. As you can see, .spec.updateStrategy.rollingUpdate.partition is not set.
apiVersion: apps/v1
kind: StatefulSet
metadata:
annotations:
kubectl.kubernetes.io/last-applied-configuration: |
{"apiVersion":"monitoring.coreos.com/v1","kind":"Alertmanager","metadata":{"annotations":{},"labels":{"alertmanager":"main"},"name":"main","namespace":"monitoring"},"spec":{"baseImage":"10.47.2.76:80/alm/alertmanager","nodeSelector":{"kubernetes.io/os":"linux"},"replicas":2,"securityContext":{"fsGroup":2000,"runAsNonRoot":true,"runAsUser":1000},"serviceAccountName":"alertmanager-main","version":"v0.19.0"}}
creationTimestamp: "2019-12-02T07:17:49Z"
generation: 4
labels:
alertmanager: main
name: alertmanager-main
namespace: monitoring
ownerReferences:
- apiVersion: monitoring.coreos.com/v1
blockOwnerDeletion: true
controller: true
kind: Alertmanager
name: main
uid: 3e3bd062-6077-468e-ac51-909b0bce1c32
resourceVersion: "521307"
selfLink: /apis/apps/v1/namespaces/monitoring/statefulsets/alertmanager-main
uid: ed4765bf-395f-4d91-8ec0-4ae23c812a42
spec:
podManagementPolicy: Parallel
replicas: 2
revisionHistoryLimit: 10
selector:
matchLabels:
alertmanager: main
app: alertmanager
serviceName: alertmanager-operated
template:
metadata:
creationTimestamp: null
labels:
alertmanager: main
app: alertmanager
spec:
containers:
- args:
- --config.file=/etc/alertmanager/config/alertmanager.yaml
- --cluster.listen-address=[$(POD_IP)]:9094
- --storage.path=/alertmanager
- --data.retention=120h
- --web.listen-address=:9093
- --web.external-url=http://10.47.0.234
- --web.route-prefix=/
- --cluster.peer=alertmanager-main-0.alertmanager-operated.monitoring.svc:9094
- --cluster.peer=alertmanager-main-1.alertmanager-operated.monitoring.svc:9094
env:
- name: POD_IP
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: status.podIP
image: 10.47.2.76:80/alm/alertmanager:v0.19.0
imagePullPolicy: IfNotPresent
livenessProbe:
failureThreshold: 10
httpGet:
path: /-/healthy
port: web
scheme: HTTP
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 3
name: alertmanager
ports:
- containerPort: 9093
name: web
protocol: TCP
- containerPort: 9094
name: mesh-tcp
protocol: TCP
- containerPort: 9094
name: mesh-udp
protocol: UDP
readinessProbe:
failureThreshold: 10
httpGet:
path: /-/ready
port: web
scheme: HTTP
initialDelaySeconds: 3
periodSeconds: 5
successThreshold: 1
timeoutSeconds: 3
resources:
requests:
memory: 200Mi
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /etc/alertmanager/config
name: config-volume
- mountPath: /alertmanager
name: alertmanager-main-db
- args:
- -webhook-url=http://localhost:9093/-/reload
- -volume-dir=/etc/alertmanager/config
image: 10.47.2.76:80/alm/configmap-reload:v0.0.1
imagePullPolicy: IfNotPresent
name: config-reloader
resources:
limits:
cpu: 100m
memory: 25Mi
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /etc/alertmanager/config
name: config-volume
readOnly: true
dnsPolicy: ClusterFirst
nodeSelector:
kubernetes.io/os: linux
restartPolicy: Always
schedulerName: default-scheduler
securityContext:
fsGroup: 2000
runAsNonRoot: true
runAsUser: 1000
serviceAccount: alertmanager-main
serviceAccountName: alertmanager-main
terminationGracePeriodSeconds: 120
volumes:
- name: config-volume
secret:
defaultMode: 420
secretName: alertmanager-main
- emptyDir: {}
name: alertmanager-main-db
updateStrategy:
type: RollingUpdate
status:
collisionCount: 0
currentReplicas: 2
currentRevision: alertmanager-main-59d7ccf598
observedGeneration: 4
readyReplicas: 2
replicas: 2
updateRevision: alertmanager-main-59d7ccf598
updatedReplicas: 2
Pjo*_*erS 12
You did not provide the whole scenario. It might depend on the Readiness Probe or the Update Strategy.
With the RollingUpdate strategy, a StatefulSet restarts its Pods in reverse ordinal order, from n-1 down to 0. Details can be found here.
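If you want to observe this order yourself, a minimal sketch (assuming the namespace and label from the question) is to watch the pods while the rollout runs:

# Watch the rollout order; with RollingUpdate the highest ordinal is recreated first
kubectl get pods -n monitoring -l app=alertmanager -w

# In another terminal, trigger the restart and follow its progress
kubectl rollout restart statefulset alertmanager-main -n monitoring
kubectl rollout status statefulset alertmanager-main -n monitoring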
Reason 1
A StatefulSet has two update strategies: OnDelete and RollingUpdate.
In the Partitions section of the rolling update documentation you can find the following information:
If a partition is specified, all Pods with an ordinal that is greater than or equal to the partition will be updated when the StatefulSet's .spec.template is updated. All Pods with an ordinal that is less than the partition will not be updated and, even if they are deleted, they will be recreated at the previous version. If a StatefulSet's .spec.updateStrategy.rollingUpdate.partition is greater than its .spec.replicas, updates to its .spec.template will not be propagated to its Pods. In most cases you will not need to use a partition, but they are useful if you want to stage an update, roll out a canary, or perform a phased roll out.
So if somewhere in the StatefulSet you have set updateStrategy.rollingUpdate.partition: 1, it will only restart Pods with an ordinal of 1 or higher.
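A quick way to rule this out is to read the partition straight from the object and, if needed, clear it. This is only a sketch and assumes the StatefulSet name and namespace from the question:

# Print the partition (empty output means no partition is set)
kubectl get statefulset alertmanager-main -n monitoring \
  -o jsonpath='{.spec.updateStrategy.rollingUpdate.partition}'

# Reset the partition to 0 so a rolling update touches every ordinal
kubectl patch statefulset alertmanager-main -n monitoring \
  -p '{"spec":{"updateStrategy":{"type":"RollingUpdate","rollingUpdate":{"partition":0}}}}'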
Example with partition: 3:
NAME    READY   STATUS    RESTARTS   AGE
web-0   1/1     Running   0          30m
web-1   1/1     Running   0          30m
web-2   1/1     Running   0          31m
web-3   1/1     Running   0          2m45s
web-4   1/1     Running   0          3m
web-5   1/1     Running   0          3m13s

Reason 2
The configuration of the Readiness probe.
If the values of initialDelaySeconds and periodSeconds are high, it may take a while before the next Pod is restarted. Details about these parameters can be found here.
In the example below, the Pod waits 10 seconds before the first probe, and the readiness probe is then checked every 2 seconds. Depending on the values, this might be the cause of the behaviour.
readinessProbe:
  failureThreshold: 3
  httpGet:
    path: /
    port: 80
    scheme: HTTP
  initialDelaySeconds: 10
  periodSeconds: 2
  successThreshold: 1
  timeoutSeconds: 1
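For comparison, you can pull the probe timings that are actually configured on this StatefulSet; a minimal sketch, assuming the names from the question:

# Show the readiness and liveness probes of the alertmanager container
kubectl get statefulset alertmanager-main -n monitoring \
  -o jsonpath='{.spec.template.spec.containers[0].readinessProbe}{"\n"}{.spec.template.spec.containers[0].livenessProbe}{"\n"}'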
Reason 3

I see that each Pod contains 2 containers.
NAME                  READY   STATUS    RESTARTS   AGE
alertmanager-main-0   2/2     Running   0          21h
alertmanager-main-1   2/2     Running   0          20s

As mentioned in the documentation:
Running - The Pod has been bound to a node, and all of the containers have been created. At least one container is still running, or is in the process of starting or restarting.
It would be good to check whether everything is OK with both containers (readinessProbe/livenessProbe, restarts, etc.).
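One way to do that check, sketched here with the namespace and label assumed from the question, is to list per-container restart counts and then describe a Pod to see any probe-failure events:

# Per-container restart counts for both alertmanager pods
kubectl get pods -n monitoring -l app=alertmanager \
  -o jsonpath='{range .items[*]}{.metadata.name}{": "}{range .status.containerStatuses[*]}{.name}{"="}{.restartCount}{" "}{end}{"\n"}{end}'

# The events at the bottom of the output show readiness/liveness probe failures, if any
kubectl describe pod alertmanager-main-0 -n monitoring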