LivenessProbe 失败,但端口转发正在同一端口上工作

Tec*_*kie 3 macos kubernetes minikube

我有以下部署 yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: gofirst
  labels:
    app: gofirst
spec:
  selector:
    matchLabels:
      app: gofirst
  template:
    metadata:
      labels:
        app: gofirst
    spec:
      restartPolicy: Always
      containers:
      - name: gofirst
        image: lbvenkatesh/gofirst:0.0.5
        resources:
          limits:
            memory: "128Mi"
            cpu: "500m"
        ports:
        - name: http
          containerPort: 8080
        livenessProbe:
          httpGet:
            path: /health
            port: http
            httpHeaders:
            - name: "X-Health-Check"
              value: "1"
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /health
            port: http
            httpHeaders:
            - name: "X-Health-Check"
              value: "1"
          initialDelaySeconds: 30
          periodSeconds: 10
Run Code Online (Sandbox Code Playgroud)

我的服务 yaml 是这样的:

apiVersion: v1
kind: Service
metadata:
  name: gofirst
  labels:
    app: gofirst
spec:
  publishNotReadyAddresses: true
  type: NodePort
  selector:
    app: gofirst
  ports:
  - port: 8080
    targetPort: http
    name: http
Run Code Online (Sandbox Code Playgroud)

“gofirst”是一个用 Golang Gin 编写的简单 Web 应用程序。这是相同的 dockerFile:

FROM golang:latest 
LABEL MAINTAINER='Venkatesh Laguduva <lbvenkatesh@gmail.com>'
RUN mkdir /app 
ADD . /app/
RUN apt -y update && apt -y install git
RUN go get github.com/gin-gonic/gin
RUN go get -u github.com/RaMin0/gin-health-check
WORKDIR /app 
RUN go build -o main . 
ARG verArg="0.0.1"
ENV VERSION=$verArg
ENV PORT=8080
ENV GIN_MODE=release
EXPOSE 8080
CMD ["/app/main"]
Run Code Online (Sandbox Code Playgroud)

我已在 Minikube 中部署了此应用程序,当我尝试描述此 pod 时,我看到了以下事件:

  Type     Reason            Age                     From               Message
  ----     ------            ----                    ----               -------
  Warning  FailedScheduling  10m (x2 over 10m)       default-scheduler  0/1 nodes are available: 1 Insufficient cpu.
  Normal   Scheduled         10m                     default-scheduler  Successfully assigned default/gofirst-95fc8668c-6r4qc to m01
  Normal   Pulling           10m                     kubelet, m01       Pulling image "lbvenkatesh/gofirst:0.0.5"
  Normal   Pulled            10m                     kubelet, m01       Successfully pulled image "lbvenkatesh/gofirst:0.0.5"
  Normal   Killing           8m13s (x2 over 9m13s)   kubelet, m01       Container gofirst failed liveness probe, will be restarted
  Normal   Pulled            8m13s (x2 over 9m12s)   kubelet, m01       Container image "lbvenkatesh/gofirst:0.0.5" already present on machine
  Normal   Created           8m12s (x3 over 10m)     kubelet, m01       Created container gofirst
  Normal   Started           8m12s (x3 over 10m)     kubelet, m01       Started container gofirst
  Warning  Unhealthy         7m33s (x7 over 9m33s)   kubelet, m01       Liveness probe failed: Get http://172.17.0.4:8080/health: dial tcp 172.17.0.4:8080: connect: connection refused
  Warning  Unhealthy         5m35s (x12 over 9m25s)  kubelet, m01       Readiness probe failed: Get http://172.17.0.4:8080/health: dial tcp 172.17.0.4:8080: connect: connection refused
  Warning  BackOff           31s (x17 over 4m13s)    kubelet, m01       Back-off restarting failed container
Run Code Online (Sandbox Code Playgroud)

我尝试了示例容器“hello-world”,并且在执行“minikube service hello-world”时运行良好,但是当我尝试使用“minikube service gofirst”进行相同操作时,我在浏览器中收到连接错误。

我必须做一些相对简单的事情,但无法找到错误。请检查我的 yaml 和 docker 文件,如果我犯了任何错误,请告诉我。

Mar*_*ney 5

我重现了您的场景并遇到了与您相同的问题。因此,我决定删除 liveness 和 rediness 探针,以便能够登录 pod 并进行调查。

这是我使用的 yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: gofirst
  labels:
    app: gofirst
spec:
  selector:
    matchLabels:
      app: gofirst
  template:
    metadata:
      labels:
        app: gofirst
    spec:
      restartPolicy: Always
      containers:
      - name: gofirst
        image: lbvenkatesh/gofirst:0.0.5
        resources:
          limits:
            memory: "128Mi"
            cpu: "500m"
        ports:
        - name: http
          containerPort: 8080
Run Code Online (Sandbox Code Playgroud)

我登录了 pod 以检查应用程序是否正在侦听您尝试测试的端口:

kubectl exec -ti gofirst-65cfc7556-bbdcg -- bash
Run Code Online (Sandbox Code Playgroud)

然后我安装了netstat:

# apt update
# apt install net-tools
Run Code Online (Sandbox Code Playgroud)

检查应用程序是否正在运行:

# ps -ef 
UID          PID    PPID  C STIME TTY          TIME CMD
root           1       0  0 10:06 ?        00:00:00 /app/main
root           9       0  0 10:06 pts/0    00:00:00 sh
root          15       9  0 10:07 pts/0    00:00:00 ps -ef
Run Code Online (Sandbox Code Playgroud)

最后检查 8080 端口是否正在监听:

# netstat -an
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State      
tcp        0      0 127.0.0.1:8080          0.0.0.0:*               LISTEN     
tcp        0      0 10.28.0.9:56106         151.101.184.204:80      TIME_WAIT  
tcp        0      0 10.28.0.9:56130         151.101.184.204:80      TIME_WAIT  
tcp        0      0 10.28.0.9:56104         151.101.184.204:80      TIME_WAIT  
Active UNIX domain sockets (servers and established)
Proto RefCnt Flags       Type       State         I-Node   Path
Run Code Online (Sandbox Code Playgroud)

正如我们所看到的,应用程序仅监听本地主机连接,而不是来自任何地方。预期输出应该是:0.0.0.0:8080

希望它能帮助您解决问题。