Prometheus kube_pod_container_status_waiting_reason does not capture the pod CrashLoopBackOff reason

Question

Pre*_*i V 3 kubernetes prometheus prometheus-operator prometheus-alertmanager kube-state-metrics

By definition, kube_pod_container_status_waiting_reason should capture the reason a pod is in the Waiting state.

Several pods in my Kubernetes cluster are in CrashLoopBackOff, but I don't see that reason in kube_pod_container_status_waiting_reason. It only captures two reasons: ErrImagePull and ContainerCreating.

~$ k get pods -o wide --show-all --all-namespaces | grep Crash
cattle-system   cattle-cluster-agent-6f744c67cc-jlkjh       0/1       CrashLoopBackOff   2885       10d       10.233.121.247   k8s-4
cattle-system   cattle-node-agent-6klkh                     0/1       CrashLoopBackOff   2886       171d      10.171.201.127   k8s-2
cattle-system   cattle-node-agent-j6r94                     0/1       CrashLoopBackOff   2887       171d      10.171.201.110   k8s-3
cattle-system   cattle-node-agent-nkfcq                     0/1       CrashLoopBackOff   17775      171d      10.171.201.131   k8s-1
cattle-system   cattle-node-agent-np76b                     0/1       CrashLoopBackOff   2887       171d      10.171.201.89    k8s-4
cattle-system   cattle-node-agent-pwn5v                     0/1       CrashLoopBackOff   2859       171d      10.171.202.72    k8s-5

Running sum by (reason) (kube_pod_container_status_waiting_reason) in Prometheus returns:

Element                       Value
{reason="ContainerCreating"}    0
{reason="ErrImagePull"}         0

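I am running the quay.io/coreos/kube-state-metrics:v1.2.0 image of kube-state-metrics.

What am I missing? Why does the CrashLoopBackOff reason not show up in the query? I want to set up an alert that finds pods in the Waiting state and reports the reason. So I want to combine kube_pod_container_status_waiting, to find the pods that are waiting, with kube_pod_container_status_waiting_reason, to find the exact reason.

Please help. Thanks!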

Answer 1

Ric*_*ico 5

You are running into a version limitation: it looks like you are using kube-state-metrics 1.2.0 or older. The ImagePullBackOff and CrashLoopBackOff reasons were added in 1.3.0.

So update your image to:

k8s.gcr.io/kube-state-metrics:v1.3.0
quay.io/coreos/kube-state-metrics:v1.3.0

or

k8s.gcr.io/kube-state-metrics:v1.4.0
quay.io/coreos/kube-state-metrics:v1.4.0

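Once you are on v1.3.0 or later, the alert described in the question can be sketched in PromQL. This is only a minimal sketch: it assumes both metrics carry namespace, pod, and container labels, and that each reason series is 1 for the active reason and 0 otherwise, so check the labels in your own setup first. Each expression below is a separate query.

```promql
# Count waiting containers per reason; CrashLoopBackOff should now appear.
sum by (reason) (kube_pod_container_status_waiting_reason)

# Waiting containers annotated with their reason (assumes matching
# namespace/pod/container labels on both metrics).
kube_pod_container_status_waiting
  * on (namespace, pod, container) group_left (reason)
  (kube_pod_container_status_waiting_reason == 1)

# Simpler alert expression for crash-looping containers only.
kube_pod_container_status_waiting_reason{reason="CrashLoopBackOff"} == 1
```

Note that the join only returns containers whose waiting reason is one that your kube-state-metrics version exports, so for a plain "waiting, whatever the reason" alert you would still use kube_pod_container_status_waiting on its own.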
