Tag: readinessprobe

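Liveness and readiness probes: connection refused

When I try to set up liveness and readiness probes for the awx_web container, I keep getting this error: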

Liveness probe failed: Get http://POD_IP:8052/: dial tcp POD_IP:8052: connect: connection refused

The "Liveness & Readiness" section of the Deployment for my awx_web container:

          ports:
          - name: http
            containerPort: 8052 # the port of the container awx_web
            protocol: TCP
          livenessProbe:
            httpGet:
              path: /
              port: 8052
            initialDelaySeconds: 5
            periodSeconds: 5
          readinessProbe:
            httpGet:
              path: /
              port: 8052
            initialDelaySeconds: 5
            periodSeconds: 5

If I test whether port 8052 is open from another pod in the same namespace as the pod containing the awx_web container, or from a container deployed in the same pod as awx_web, I get this (the port is open):

/ # nc -vz POD_IP 8052
POD_IP  (POD_IP :8052) open
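
The `nc -vz` check above only confirms that something accepts TCP connections on that port at that moment. As a minimal sketch (not part of the original question), the same reachability test could be written in Python; the function name `port_open` and the one-second default timeout are assumptions made for illustration.

```python
import socket


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection performs the TCP handshake, much like `nc -z`.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

Calling `port_open(pod_ip, 8052)` with the real pod IP would be expected to return `True` in the situation described, since `nc` reports the port as open.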

If I, from the … on which the pod containing the awx_web container is deployed …

http ansible-awx kubernetes readinessprobe livenessprobe

Abd*_*ane

2022-11-13

9
Votes

3
Answers

70k
Views

Kubernetes health checks: timeoutSeconds exceeds periodSeconds

In Kubernetes health check probes, what happens if timeoutSeconds exceeds periodSeconds? For example:

initialDelaySeconds: 10
periodSeconds: 5
timeoutSeconds: 10
successThreshold: 1
failureThreshold: 3

When will the Pod "fail"?

initialDelaySeconds + (periodSeconds * failureThreshold); or
initialDelaySeconds + (MAX(periodSeconds, timeoutSeconds) * failureThreshold);

The same question applies to when the Pod succeeds.
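
To make the two candidates concrete, here is a small Python sketch that simply evaluates both formulas for the values in the question. It does not say which one the kubelet actually follows; the function name `candidate_fail_times` is made up for illustration.

```python
def candidate_fail_times(initial_delay, period, timeout, failure_threshold):
    """Evaluate the two candidate formulas from the question, in seconds."""
    # Candidate 1: only periodSeconds spaces the attempts.
    by_period = initial_delay + period * failure_threshold
    # Candidate 2: the larger of periodSeconds and timeoutSeconds spaces them.
    by_max = initial_delay + max(period, timeout) * failure_threshold
    return by_period, by_max


# Values from the question: initialDelaySeconds=10, periodSeconds=5,
# timeoutSeconds=10, failureThreshold=3.
print(candidate_fail_times(10, 5, 10, 3))  # (25, 40)
```

So the two readings differ by 15 seconds for this configuration, which is what the question is really asking about.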

kubernetes kubernetes-health-check readinessprobe livenessprobe

h q*_*h q

8
Votes

1
Answers

3836
Views

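k8s readiness and liveness probes fail even though the endpoints are working

I have a Next.js application with two simple endpoints, readiness and liveness, implemented as follows: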

return res.status(200).send('OK');

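I created the endpoints following the API routes documentation. I also have a basePath of /stats, set up according to the documentation here. So the probe endpoints are at /stats/api/readiness and /stats/api/liveness.

When I build and run the application in a local Docker container, the probe endpoints are reachable and return 200 OK.

However, when I deploy the application to my k8s cluster, the probes fail. initialDelaySeconds leaves plenty of time, so that is not the cause.

I connected to the pod's service through port-forward. Right after the pod starts, before it begins failing, I can reach the endpoints and they return 200 OK. After a while, they start failing as usual.

I also tried reaching the failing pod from a healthy pod: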

k exec -t [healthy pod name] -- curl -l 10.133.2.35:8080/stats/api/readiness

Same result: at first, while the pod has not yet failed, the curl command returns 200 OK. After a while, it starts failing.

The probe error I get is:

Readiness probe failed: Get http://10.133.2.35:8080/stats/api/readiness: net/http: request canceled (Client.Timeout exceeded while awaiting headers)

An interesting experiment - …

kubernetes google-kubernetes-engine readinessprobe livenessprobe

Mil*_*iez

2021-06-17

7
Votes

1
Answers

4700
Views

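Is a Pod probed again after its readiness probe fails?

readinessProbe: Indicates whether the container is ready to respond to requests. If the readiness probe fails, the endpoints controller removes the Pod's IP address from the endpoints of all Services that match the Pod. The default state of readiness before the initial delay is Failure. If a Container does not provide a readiness probe, the default state is Success.

If the readiness probe fails (and the Pod's IP address is removed from the endpoints), what happens next? Will the Pod's readiness probe be checked again? Will it be checked again after the initial delay? Is there any chance that the Pod's IP address is added to the endpoints again (if the Pod heals itself after the readiness probe failed)? Will the Pod receive traffic again once it has healed?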
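
For context, readiness probes keep running every periodSeconds for the whole life of the container, and initialDelaySeconds only applies after the container starts. The Python sketch below is a simplified model of how the Ready state follows a sequence of probe results; it is not the kubelet's implementation, and the function name `readiness_timeline` is hypothetical. The defaults mirror the Kubernetes defaults of successThreshold 1 and failureThreshold 3.

```python
def readiness_timeline(probe_results, success_threshold=1, failure_threshold=3):
    """Yield the Ready state after each probe in a simplified model.

    probe_results is an iterable of booleans, one per periodSeconds tick.
    """
    ready = False  # a container with a readiness probe starts out not ready
    successes = failures = 0
    for ok in probe_results:
        if ok:
            successes, failures = successes + 1, 0
            if successes >= success_threshold:
                ready = True   # pod IP is (re)added to Service endpoints
        else:
            failures, successes = failures + 1, 0
            if failures >= failure_threshold:
                ready = False  # pod IP is removed from Service endpoints
        yield ready
```

For example, `list(readiness_timeline([True, False, False, False, True]))` gives `[True, True, True, False, True]`: the pod drops out after three consecutive failures and comes back on the next successful probe.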

self-healing kubernetes google-kubernetes-engine kubernetes-pod readinessprobe

Use*_*678

2021-06-14

6
Votes

1
Answers

715
Views

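How to configure a Kubernetes startup probe with Spring Actuator

I have read some documentation and figured out how to set up readiness and liveness endpoints with Actuator, like this one. But I cannot figure out how to set up an endpoint for the "startup" probe.

My application.yml: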

management:
  endpoints:
    web:
      exposure:
        include: "*"
  endpoint:
    health:
      show-details: "ALWAYS"
      group:
        readiness.include: readinessProbe, dataStream
        startup.include: readinessProbe, dataStream

My deployment config:

  livenessProbe:
    httpGet:
      path: "/actuator/health/liveness"
      port: "http"
    initialDelaySeconds: 600
    periodSeconds: 15
  readinessProbe:
    httpGet:
      path: "/actuator/health/readiness"
      port: "http"
    periodSeconds: 30
    failureThreshold: 15
  startupProbe:
    httpGet:
      path: "/actuator/health/startup"
      port: "http"
    initialDelaySeconds: 150
    periodSeconds: 10
    failureThreshold: 30

Actuator does not seem to provide a URL for a "startup" probe; in other words, http://localhost:8080/actuator/health/startup does not work. How can I set it up?

startup spring-boot kubernetes spring-boot-actuator readinessprobe

ris*_*hav

6
Votes

1
Answers

8478
Views

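Liveness and readiness probes for multiple containers in a Pod

I would like to know whether liveness and readiness probe checks can be applied to multiple containers in a Pod, or only to one container in a Pod. I did try checking multiple containers, but the probe check for container A fails while the probe check in container B passes.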

kubernetes kubernetes-health-check readinessprobe livenessprobe

san*_*ndy

6
Votes

2
Answers

7436
Views

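Pod receives traffic even though the Kubernetes readiness probe fails

I have an application that serves REST requests and listens on a Kafka topic. I deployed the application to Kubernetes and configured the readiness probe like this: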

readinessProbe:
  exec:
    command:
    - cat
    - /tmp/healthy
  initialDelaySeconds: 5
  periodSeconds: 5

This basically follows the instructions in [configure-liveness-readiness-startup-probes].

After deployment, I can see that the pod's readiness probe fails:

Readiness probe failed: cat: can't open '/tmp/healthy': No such file or directory

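This is expected. Then I sent a Kafka message to the topic. I observed that:

1) The Kafka message was consumed by my application and saved to the database.
2) The REST API was not reachable.

I assumed that if the pod's readiness probe fails, the application can receive neither Kafka messages nor REST requests. So why, in my test, are REST requests and Kafka messages handled differently?

According to the Kubernetes documentation: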

The kubelet uses readiness probes to know when a Container is ready to start accepting traffic

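But it does not state clearly what kind of traffic it actually means. If the readiness probe fails, does Kubernetes only restrict HTTP traffic to the pod, but not TCP traffic (since Kafka works over TCP)?

My actual intention is to let my service application (the Kafka consumer) control when it receives Kafka messages (and REST …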

apache-kafka kubernetes readinessprobe

She*_*Liu

2019-11-22

5
Votes

1
Answers

1896
Views

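How to define k8s liveness and readiness probes for a worker pod

I have a k8s cluster. Our services are queue-based: our Pods subscribe to an event queue, fetch events, and execute tasks. For this kind of service, how should the k8s liveness and readiness probes be defined?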
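
One common pattern for workers without an HTTP endpoint is a heartbeat file checked by an exec probe. The Python sketch below assumes that approach; it is not taken from the question or its answers, and the function names and any file path passed in are hypothetical.

```python
import os
import time


def touch_heartbeat(path: str) -> None:
    """Called by the worker after each processed event or idle poll."""
    with open(path, "a"):
        os.utime(path, None)  # set the file's mtime to now


def heartbeat_is_fresh(path: str, max_age_seconds: float) -> bool:
    """Check for an exec probe: True if the heartbeat was updated recently."""
    try:
        age = time.time() - os.path.getmtime(path)
    except OSError:
        return False  # no heartbeat file has been written yet
    return age <= max_age_seconds
```

An exec probe could then run a small script that exits with status 0 when `heartbeat_is_fresh(...)` returns `True` and a non-zero status otherwise, since an exec probe counts as successful only when the command exits with 0. The maximum age should comfortably exceed the longest task the worker is expected to run.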

kubernetes readinessprobe livenessprobe

Ruo*_*ang

2022-11-13

5
Votes

1
Answers

2674
Views

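K8s liveness probe behavior when a pod contains multiple containers?

Scenario: a K8s Pod has multiple containers, and liveness/readiness probes are configured for each container. Now, if the liveness probe succeeds on some containers but fails on a few others, what will k8s do?

Will it restart only the failing containers
or
will it restart the entire Pod?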
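
For context, probes are defined and evaluated per container: a failed liveness probe restarts only that container (subject to the Pod's restartPolicy), while the Pod as a whole counts as Ready only when all of its containers are ready. The Python sketch below is a simplified model of that rule, not kubelet code; the function name `evaluate_pod` and the dictionary layout are assumptions made for illustration.

```python
def evaluate_pod(containers):
    """containers maps name -> {"live": bool, "ready": bool} probe outcomes.

    Returns (names of containers to restart, whether the Pod counts as Ready).
    """
    # A failed liveness probe restarts only that container, not its siblings.
    to_restart = sorted(name for name, c in containers.items() if not c["live"])
    # The Pod is Ready only when every container is ready.
    pod_ready = all(c["ready"] for c in containers.values())
    return to_restart, pod_ready
```

With container `a` failing both probes and container `b` passing both, `evaluate_pod` returns `(["a"], False)`: only `a` is restarted, and the Pod is taken out of Service endpoints until `a` is ready again.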

kubernetes kubernetes-pod readinessprobe livenessprobe

sam*_*ers

5
Votes

2
Answers

8643
Views

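Readiness probe gives an out-of-service error

I have updated Spring Boot from 2.1.8.RELEASE to 2.5.2. Before that, the liveness and readiness probes worked fine. Now, after updating Spring Boot, I updated my application properties file: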

management.endpoints.web.exposure.include=*
management.endpoints.jmx.exposure.include=health,info
management.endpoint.metrics.enabled=true
management.endpoint.prometheus.enabled=true
management.metrics.enable.*=true
management.metrics.enable.all=true
management.metrics.export.prometheus.enabled=true
management.metrics.use-global-registry=true




management.endpoint.health.probes.enabled=true
management.health.livenessstate.enabled=true
management.health.readinessstate.enabled=true
management.endpoint.health.show-details=always

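I added the last 4 lines based on some documentation.

After starting the project locally and checking from the browser:

http://localhost:8090/actuator/health/liveness gives {"status":"UP"}

But http://localhost:8090/actuator/health/readiness gives {"status":"OUT_OF_SERVICE"} with a 503 status.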

{
    "status": "OUT_OF_SERVICE",
    "components": {
        "diskSpace": {
            "status": "UP",
            "details": {
                "total": 499963174912,
                "free": 326313852928,
                "threshold": 10485760,
                "exists": true
            }
        },
        "livenessState": {
            "status": "UP"
        },
        "ping": {
            "status": "UP"
        },
        "readinessState": {
            "status": "OUT_OF_SERVICE"
        }
    },
    "groups": [
        "liveness",
        "readiness"
    ] …

maven spring-boot kubernetes readinessprobe livenessprobe

Fay*_*007

2021-08-16

5
Votes

0
Answers

3455
Views