"kubectl debug" hangs on 1.20 with the feature gate enabled

koe*_*ehn 1 debugging kubernetes

I have a Kubernetes 1.20 cluster with kubectl 1.20 and the EphemeralContainers feature gate enabled.

I'm trying to run the commands from the kubectl debug documentation, but they don't seem to work. I can start a pod with:

$ kubectl run ephemeral-demo --image=k8s.gcr.io/pause:3.1 --restart=Never
pod/ephemeral-demo created

But when I try to attach a debug container to it:

$ kubectl debug -it ephemeral-demo --image=busybox --target=ephemeral-demo
Defaulting debug container name to debugger-g6pj6.

No matter how long I wait or how many times I press <enter>, I never get a command prompt. If I inspect the pod, I can see that the debug container exists:

$ kubectl describe pod ephemeral-demo
Name:         ephemeral-demo
Namespace:    nextcloud
Priority:     0
Node:         k8s-htz-worker-02/78.47.15.149
Start Time:   Tue, 15 Dec 2020 06:36:30 -0600
Labels:       run=ephemeral-demo
Annotations:  cni.projectcalico.org/podIP: 10.244.2.186/32
              cni.projectcalico.org/podIPs: 10.244.2.186/32
Status:       Running
IP:           10.244.2.186
IPs:
  IP:  10.244.2.186
Containers:
  ephemeral-demo:
    Container ID:   docker://b6d3ffa3d2ee8eb6a51a3b5ba823392cf57ed836833830510a2625788f8789d6
    Image:          k8s.gcr.io/pause:3.1
    Image ID:       docker-pullable://k8s.gcr.io/pause@sha256:f78411e19d84a252e53bff71a4407a5686c46983a2c2eeed83929b888179acea
    Port:           <none>
    Host Port:      <none>
    State:          Running
      Started:      Tue, 15 Dec 2020 06:36:32 -0600
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-btnzm (ro)
Ephemeral Containers:
  debugger-g6pj6:
    Image:        busybox
    Port:         <none>
    Host Port:    <none>
    Environment:  <none>
    Mounts:       <none>
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  default-token-btnzm:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-btnzm
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type    Reason     Age   From               Message
  ----    ------     ----  ----               -------
  Normal  Scheduled  55s   default-scheduler  Successfully assigned nextcloud/ephemeral-demo to k8s-htz-worker-02
  Normal  Pulled     53s   kubelet            Container image "k8s.gcr.io/pause:3.1" already present on machine
  Normal  Created    53s   kubelet            Created container ephemeral-demo
  Normal  Started    53s   kubelet            Started container ephemeral-demo

But if I try to exec into it, that fails:

$ kc exec -it ephemeral-demo -c debugger-g6pj6 -- bash
error: unable to upgrade connection: container not found ("debugger-g6pj6")

Am I missing something?

koe*_*ehn 5

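The problem was that when I enabled the feature gate on the master node (/etc/kubernetes/manifests/kube-apiserver.yaml), the change did not propagate to the worker nodes in the cluster (/var/lib/kubelet/config.yaml). Manually applying the change on the worker nodes and restarting the kubelet (systemctl restart kubelet.service) fixed the issue.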
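
For anyone hitting the same thing, here is a rough sketch of what the worker-node change might look like. It assumes the kubelet reads a KubeletConfiguration file at /var/lib/kubelet/config.yaml, as in my setup; your file will contain many other fields, which should be left as they are. Only the featureGates entry is the relevant addition.

```yaml
# /var/lib/kubelet/config.yaml on each worker node (fragment only; existing fields omitted)
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  # Must be enabled on the kubelet as well as on the API server,
  # otherwise the ephemeral container is recorded in the pod spec but never started.
  EphemeralContainers: true
```

After editing the file, the kubelet on that node has to be restarted for the change to take effect.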