我有一个运行正常的 K8s 集群,但由于电源故障,所有节点都重新启动。
目前我在恢复主节点(和其他节点)时遇到了一些问题:
sudo systemctl kubelet status正在返回,Unknown operation kubelet.但是当我运行kubeadm init ...(我设置集群的命令)时,它返回:error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR Port-6443]: Port 6443 is in use
[ERROR Port-10251]: Port 10251 is in use
[ERROR Port-10252]: Port 10252 is in use
[ERROR FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml]: /etc/kubernetes/manifests/kube-apiserver.yaml already exists
[ERROR FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml]: /etc/kubernetes/manifests/kube-controller-manager.yaml already exists
[ERROR FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml]: /etc/kubernetes/manifests/kube-scheduler.yaml already exists
[ERROR FileAvailable--etc-kubernetes-manifests-etcd.yaml]: /etc/kubernetes/manifests/etcd.yaml already exists
[ERROR Port-10250]: Port 10250 is in use
[ERROR Port-2379]: Port 2379 is in use
[ERROR Port-2380]: Port 2380 is in use
[ERROR DirAvailable--var-lib-etcd]: /var/lib/etcd is not empty
Run Code Online (Sandbox Code Playgroud)
当我检查这些端口时,我可以看到 kubelet 和其他 K8s 组件正在使用它们:
~/k8s-multi-node$ sudo lsof -i :10251
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
kube-sche 26292 root 3u IPv6 104933 0t0 TCP *:10251 (LISTEN)
~/k8s-multi-node$ sudo lsof -i :10252
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
kube-cont 26256 root 3u IPv6 115541 0t0 TCP *:10252 (LISTEN)
~/k8s-multi-node$ sudo lsof -i :10250
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
kubelet 24781 root 27u IPv6 106821 0t0 TCP *:10250 (LISTEN)
Run Code Online (Sandbox Code Playgroud)
我试图杀死他们,但他们又开始使用这些端口。
那么恢复这样一个集群的正确方法是什么?我是否需要删除 kubelet 和所有 otehr 组件并重新安装它们?
| 归档时间: |
|
| 查看次数: |
2729 次 |
| 最近记录: |