Kube-Proxy and Kube-Flannel CrashLoopBackOff

Asked by Rah*_*sha · 3 · tags: kubernetes, flannel, kube-proxy

I have a Kubernetes cluster on an on-premises server, and I also have a server on Naver Cloud, which we can call server A. I want to join server A to my Kubernetes cluster. The server joins fine, but the kube-proxy and kube-flannel pods generated from their DaemonSets are stuck in CrashLoopBackOff status.

Here are the logs from kube-proxy:

I0405 03:13:48.566285       1 node.go:163] Successfully retrieved node IP: 10.1.0.2
I0405 03:13:48.566382       1 server_others.go:109] "Detected node IP" address="10.1.0.2"
I0405 03:13:48.566420       1 server_others.go:535] "Using iptables proxy"
I0405 03:13:48.616989       1 server_others.go:176] "Using iptables Proxier"
I0405 03:13:48.617021       1 server_others.go:183] "kube-proxy running in dual-stack mode" ipFamily=IPv4
I0405 03:13:48.617040       1 server_others.go:184] "Creating dualStackProxier for iptables"
I0405 03:13:48.617063       1 server_others.go:465] "Detect-local-mode set to ClusterCIDR, but no IPv6 cluster CIDR defined, , defaulting to no-op detect-local for IPv6"
I0405 03:13:48.617093       1 proxier.go:242] "Setting route_localnet=1 to allow node-ports on localhost; to change this either disable iptables.localhostNodePorts (--iptables-localhost-nodeports) or set nodePortAddresses (--nodeport-addresses) to filter loopback addresses"
I0405 03:13:48.617420       1 server.go:655] "Version info" version="v1.26.0"
I0405 03:13:48.617435       1 server.go:657] "Golang settings" GOGC="" GOMAXPROCS="" GOTRACEBACK=""
I0405 03:13:48.618790       1 conntrack.go:52] "Setting nf_conntrack_max" nf_conntrack_max=131072
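
The log simply stops after the nf_conntrack_max line, so the actual crash message is most likely in the previous container instance. It can usually be recovered with (pod name taken from the kubectl output further down):

kubectl -n kube-system logs kube-proxy-f997t --previous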

There are no logs from kube-flannel. The kube-flannel pods fail on the init container named install-cni-plugin, and when I try kubectl -n kube-flannel logs kube-flannel-ds-d2l4q -c install-cni-plugin it returns:

unable to retrieve container logs for docker://47e4c8c580474b384b128c8e4d74297a0e891b5f227c6313146908b06ee7b376
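
Since kubectl cannot fetch the logs, the container can still be inspected directly on the failing node with the Docker CLI, using the container ID from the error message above:

docker logs 47e4c8c580474b384b128c8e4d74297a0e891b5f227c6313146908b06ee7b376
docker inspect --format '{{.State.ExitCode}}: {{.State.Error}}' 47e4c8c580474b384b128c8e4d74297a0e891b5f227c6313146908b06ee7b376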

I have no other clues. Please let me know if I should attach more information.

Please help, I've been stuck on this for so long. TT

More information:

kubectl get nodes

NAME                      STATUS     ROLES           AGE   VERSION
accio-randi-ed05937533    Ready      <none>          8d    v1.26.3
accio-test-1-b3fb4331ee   NotReady   <none>          89m   v1.26.3
master                    Ready      control-plane   48d   v1.26.1
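
The new node shows NotReady, which is expected while the CNI never comes up. Its node conditions spell out the reason and can be checked with, for example:

kubectl describe node accio-test-1-b3fb4331ee | grep -A 8 Conditions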

kubectl -n kube-system get pods

NAME                             READY   STATUS             RESTARTS         AGE
coredns-787d4945fb-rms6t         1/1     Running            0                30d
coredns-787d4945fb-t6g8s         1/1     Running            0                33d
etcd-master                      1/1     Running            168 (36d ago)    48d
kube-apiserver-master            1/1     Running            158 (36d ago)    48d
kube-controller-manager-master   1/1     Running            27 (6d17h ago)   48d
kube-proxy-2r8tn                 1/1     Running            6 (36d ago)      48d
kube-proxy-f997t                 0/1     CrashLoopBackOff   39 (90s ago)     87m
kube-proxy-wc9x5                 1/1     Running            0                8d
kube-scheduler-master            1/1     Running            27 (6d17h ago)   48d

kubectl -n kube-system get events

LAST SEEN   TYPE      REASON             OBJECT                               MESSAGE
42s         Warning   DNSConfigForming   pod/coredns-787d4945fb-rms6t         Nameserver limits were exceeded, some nameservers have been omitted, the applied nameserver line is: 10.8.0.1 192.168.18.1 fe80::1%3
54s         Warning   DNSConfigForming   pod/coredns-787d4945fb-t6g8s         Nameserver limits were exceeded, some nameservers have been omitted, the applied nameserver line is: 10.8.0.1 192.168.18.1 fe80::1%3
3m10s       Warning   DNSConfigForming   pod/etcd-master                      Nameserver limits were exceeded, some nameservers have been omitted, the applied nameserver line is: 10.8.0.1 192.168.18.1 fe80::1%3
2m48s       Warning   DNSConfigForming   pod/kube-apiserver-master            Nameserver limits were exceeded, some nameservers have been omitted, the applied nameserver line is: 10.8.0.1 192.168.18.1 fe80::1%3
3m33s       Warning   DNSConfigForming   pod/kube-controller-manager-master   Nameserver limits were exceeded, some nameservers have been omitted, the applied nameserver line is: 10.8.0.1 192.168.18.1 fe80::1%3
3m7s        Warning   DNSConfigForming   pod/kube-proxy-2r8tn                 Nameserver limits were exceeded, some nameservers have been omitted, the applied nameserver line is: 10.8.0.1 192.168.18.1 fe80::1%3
15s         Normal    SandboxChanged     pod/kube-proxy-f997t                 Pod sandbox changed, it will be killed and re-created.
5m15s       Warning   BackOff            pod/kube-proxy-f997t                 Back-off restarting failed container kube-proxy in pod kube-proxy-f997t_kube-system(7652a1c4-9517-4a8a-a736-1f746f36c7ab)
3m30s       Warning   DNSConfigForming   pod/kube-scheduler-master            Nameserver limits were exceeded, some nameservers have been omitted, the applied nameserver line is: 10.8.0.1 192.168.18.1 fe80::1%3
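
As an aside, the DNSConfigForming warnings are a separate, non-fatal issue: the kubelet copies the node's /etc/resolv.conf into pods and supports at most three nameservers, so extra entries are dropped. Trimming the nameserver list on the node silences them:

cat /etc/resolv.conf    # kubelet keeps only the first three nameserver entries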

kubectl -n kube-flannel get pods

NAME                    READY   STATUS                  RESTARTS      AGE
kube-flannel-ds-2xgbw   1/1     Running                 0             8d
kube-flannel-ds-htgts   0/1     Init:CrashLoopBackOff   0 (2s ago)    88m
kube-flannel-ds-sznbq   1/1     Running                 6 (36d ago)   48d

kubectl -n kube-flannel get events

LAST SEEN   TYPE      REASON             OBJECT                      MESSAGE
100s        Normal    SandboxChanged     pod/kube-flannel-ds-htgts   Pod sandbox changed, it will be killed and re-created.
26m         Normal    Pulled             pod/kube-flannel-ds-htgts   Container image "docker.io/flannel/flannel-cni-plugin:v1.1.2" already present on machine
46m         Warning   BackOff            pod/kube-flannel-ds-htgts   Back-off restarting failed container install-cni-plugin in pod kube-flannel-ds-htgts_kube-flannel(4f602997-5502-4dcf-8fca-23eba01325dd)
5m          Warning   DNSConfigForming   pod/kube-flannel-ds-sznbq   Nameserver limits were exceeded, some nameservers have been omitted, the applied nameserver line is: 10.8.0.1 192.168.18.1 fe80::1%3
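
The describe output below shows that install-cni-plugin actually exits with code 0, so the copy itself succeeds before the sandbox is torn down. Whether the plugin binary made it onto the node can be verified directly there:

ls -l /opt/cni/bin/flannel /etc/cni/net.d/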

kubectl -n kube-flannel describe pod kube-flannel-ds-htgts

Name:                 kube-flannel-ds-htgts
Namespace:            kube-flannel
Priority:             2000001000
Priority Class Name:  system-node-critical
Service Account:      flannel
Node:                 accio-test-1-b3fb4331ee/10.1.0.2
Start Time:           Thu, 06 Apr 2023 09:25:12 +0900
Labels:               app=flannel
                      controller-revision-hash=6b7b59d784
                      k8s-app=flannel
                      pod-template-generation=1
                      tier=node
Annotations:          <none>
Status:               Pending
IP:                   10.1.0.2
IPs:
  IP:           10.1.0.2
Controlled By:  DaemonSet/kube-flannel-ds
Init Containers:
  install-cni-plugin:
    Container ID:  docker://0fed30cc41f305203bf5d6fb7668f92f449a65f722faf1360e61231e9107ef66
    Image:         docker.io/flannel/flannel-cni-plugin:v1.1.2
    Image ID:      docker-pullable://flannel/flannel-cni-plugin@sha256:bf4b62b131666d040f35a327d906ee5a3418280b68a88d9b9c7e828057210443
    Port:          <none>
    Host Port:     <none>
    Command:
      cp
    Args:
      -f
      /flannel
      /opt/cni/bin/flannel
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Thu, 06 Apr 2023 15:11:34 +0900
      Finished:     Thu, 06 Apr 2023 15:11:34 +0900
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /opt/cni/bin from cni-plugin (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-gbk6z (ro)
  install-cni:
    Container ID:
    Image:         docker.io/flannel/flannel:v0.21.0
    Image ID:
    Port:          <none>
    Host Port:     <none>
    Command:
      cp
    Args:
      -f
      /etc/kube-flannel/cni-conf.json
      /etc/cni/net.d/10-flannel.conflist
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /etc/cni/net.d from cni (rw)
      /etc/kube-flannel/ from flannel-cfg (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-gbk6z (ro)
Containers:
  kube-flannel:
    Container ID:
    Image:         docker.io/flannel/flannel:v0.21.0
    Image ID:
    Port:          <none>
    Host Port:     <none>
    Command:
      /opt/bin/flanneld
    Args:
      --ip-masq
      --kube-subnet-mgr
      --iface=accio-k8s-net
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Requests:
      cpu:     100m
      memory:  50Mi
    Environment:
      POD_NAME:                 kube-flannel-ds-htgts (v1:metadata.name)
      POD_NAMESPACE:            kube-flannel (v1:metadata.namespace)
      KUBERNETES_SERVICE_HOST:  10.1.0.1
      KUBERNETES_SERVICE_PORT:  6443
      EVENT_QUEUE_DEPTH:        5000
    Mounts:
      /etc/kube-flannel/ from flannel-cfg (rw)
      /run/flannel from run (rw)
      /run/xtables.lock from xtables-lock (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-gbk6z (ro)
Conditions:
  Type              Status
  Initialized       False
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  run:
    Type:          HostPath (bare host directory volume)
    Path:          /run/flannel
    HostPathType:
  cni-plugin:
    Type:          HostPath (bare host directory volume)
    Path:          /opt/cni/bin
    HostPathType:
  cni:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/cni/net.d
    HostPathType:
  flannel-cfg:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      kube-flannel-cfg
    Optional:  false
  xtables-lock:
    Type:          HostPath (bare host directory volume)
    Path:          /run/xtables.lock
    HostPathType:  FileOrCreate
  kube-api-access-gbk6z:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 :NoSchedule op=Exists
                             node.kubernetes.io/disk-pressure:NoSchedule op=Exists
                             node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                             node.kubernetes.io/network-unavailable:NoSchedule op=Exists
                             node.kubernetes.io/not-ready:NoExecute op=Exists
                             node.kubernetes.io/pid-pressure:NoSchedule op=Exists
                             node.kubernetes.io/unreachable:NoExecute op=Exists
                             node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
  Type     Reason          Age                      From     Message
  ----     ------          ----                     ----     -------
  Warning  BackOff         31m (x8482 over 5h46m)   kubelet  Back-off restarting failed container install-cni-plugin in pod kube-flannel-ds-htgts_kube-flannel(4f602997-5502-4dcf-8fca-23eba01325dd)
  Normal   Created         21m (x8783 over 5h46m)   kubelet  Created container install-cni-plugin
  Normal   Pulled          11m (x9051 over 5h46m)   kubelet  Container image "docker.io/flannel/flannel-cni-plugin:v1.1.2" already present on machine
  Normal   SandboxChanged  81s (x18656 over 5h46m)  kubelet  Pod sandbox changed, it will be killed and re-created.
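
The event counters are telling: roughly 18,000 SandboxChanged events in under six hours means the container runtime is destroying and recreating the pod sandbox in a loop, which points at the runtime rather than at flannel itself. The kubelet log on the node usually names the reason:

journalctl -u kubelet --since "1 hour ago" | grep -iE 'sandbox|cni'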

Answer by 小智 · 5

One of my nodes ran into a similar problem due to an incorrectly configured container runtime. Check the containerd configuration in /etc/containerd/config.toml, which is where daemon-level options are specified. A default configuration can be generated by running:

containerd config default > /etc/containerd/config.toml
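
Note that this overwrites the file, so back up any existing configuration first:

cp /etc/containerd/config.toml /etc/containerd/config.toml.bak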

To use the systemd cgroup driver with runc, set the following in /etc/containerd/config.toml:

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
  ...
 [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
    SystemdCgroup = true

If the cgroup driver does not match the one the kubelet uses (kubeadm configures the kubelet for the systemd driver by default since v1.22), pods on that node can get stuck in CrashLoopBackOff.
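
For the change to take effect, restart containerd and then delete the crashing pods so their DaemonSets recreate them; a sketch, using the pod names from the question:

sudo systemctl restart containerd
grep SystemdCgroup /etc/containerd/config.toml   # should print: SystemdCgroup = true
kubectl -n kube-system delete pod kube-proxy-f997t
kubectl -n kube-flannel delete pod kube-flannel-ds-htgts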