calico-kube-controllers and coredns pods stuck in Pending state

vid*_*ddy

I am trying to deploy a Kubernetes cluster. My master node is up and running, but some pods are stuck in the Pending state. Here is the output of kubectl get pods --all-namespaces -o wide:

NAMESPACE     NAME                                        READY   STATUS    RESTARTS   AGE   IP       NODE                NOMINATED NODE   READINESS GATES
kube-system   calico-kube-controllers-65b4876956-29tj9    0/1     Pending   0          9h    <none>   <none>              <none>           <none>
kube-system   calico-node-bf25l                           2/2     Running   2          9h    <none>   master-0-eccdtest   <none>           <none>
kube-system   coredns-7d6cf57b54-b55zw                    0/1     Pending   0          9h    <none>   <none>              <none>           <none>
kube-system   coredns-7d6cf57b54-bk6j5                    0/1     Pending   0          12m   <none>   <none>              <none>           <none>
kube-system   kube-apiserver-master-0-eccdtest            1/1     Running   1          9h    <none>   master-0-eccdtest   <none>           <none>
kube-system   kube-controller-manager-master-0-eccdtest   1/1     Running   1          9h    <none>   master-0-eccdtest   <none>           <none>
kube-system   kube-proxy-jhfjj                            1/1     Running   1          9h    <none>   master-0-eccdtest   <none>           <none>
kube-system   kube-scheduler-master-0-eccdtest            1/1     Running   1          9h    <none>   master-0-eccdtest   <none>           <none>
kube-system   openstack-cloud-controller-manager-tlp4m    1/1     CrashLoopBackOff   114        9h    <none>   master-0-eccdtest   <none>           <none>

When I try to check the pod logs, I get the following error:

Error from server: no preferred addresses found; known addresses: []
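This error usually means the API server has no addresses recorded for the node it must reach to fetch the logs; notice that the IP column is <none> for every pod above. A quick way to confirm, using the node name from the output above, is:

kubectl get nodes -o wide
kubectl describe node master-0-eccdtest | grep -A5 Addresses

If the node object lists no InternalIP, kubectl logs and kubectl exec will fail exactly like this. That would also be consistent with the crashing openstack-cloud-controller-manager, since with an external cloud provider it is the component that populates the node addresses.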

kubectl get events shows a lot of warnings:

NAMESPACE     LAST SEEN   TYPE      REASON                    KIND   MESSAGE
default       23m         Normal    Starting                  Node   Starting kubelet.
default       23m         Normal    NodeHasSufficientMemory   Node   Node master-0-eccdtest status is now: NodeHasSufficientMemory
default       23m         Normal    NodeHasNoDiskPressure     Node   Node master-0-eccdtest status is now: NodeHasNoDiskPressure
default       23m         Normal    NodeHasSufficientPID      Node   Node master-0-eccdtest status is now: NodeHasSufficientPID
default       23m         Normal    NodeAllocatableEnforced   Node   Updated Node Allocatable limit across pods
default       23m         Normal    Starting                  Node   Starting kube-proxy.
default       23m         Normal    RegisteredNode            Node   Node master-0-eccdtest event: Registered Node master-0-eccdtest in Controller
kube-system   26m         Warning   FailedScheduling          Pod    0/1 nodes are available: 1 node(s) had taints that the pod didn't tolerate.
kube-system   3m15s       Warning   FailedScheduling          Pod    0/1 nodes are available: 1 node(s) had taints that the pod didn't tolerate.
kube-system   25m         Warning   DNSConfigForming          Pod    Nameserver limits were exceeded, some nameservers have been omitted, the applied nameserver line is: 10.96.0.10 10.51.40.100 10.51.40.103
kube-system   23m         Normal    SandboxChanged            Pod    Pod sandbox changed, it will be killed and re-created.
kube-system   23m         Normal    Pulled                    Pod    Container image "registry.eccd.local:5000/node:v3.6.1-26684321" already present on machine
kube-system   23m         Normal    Created                   Pod    Created container
kube-system   23m         Normal    Started                   Pod    Started container
kube-system   23m         Normal    Pulled                    Pod    Container image "registry.eccd.local:5000/cni:v3.6.1-26684321" already present on machine
kube-system   23m         Normal    Created                   Pod    Created container
kube-system   23m         Normal    Started                   Pod    Started container
kube-system   23m         Warning   Unhealthy                 Pod    Readiness probe failed: Threshold time for bird readiness check:  30s
calico/node is not ready: felix is not ready: Get http://localhost:9099/readiness: dial tcp [::1]:9099: connect: connection refused
kube-system   23m     Warning   Unhealthy          Pod          Liveness probe failed: Get http://localhost:9099/liveness: dial tcp [::1]:9099: connect: connection refused
kube-system   26m     Warning   FailedScheduling   Pod          0/1 nodes are available: 1 node(s) had taints that the pod didn't tolerate.
kube-system   3m15s   Warning   FailedScheduling   Pod          0/1 nodes are available: 1 node(s) had taints that the pod didn't tolerate.
kube-system   105s    Warning   FailedScheduling   Pod          0/1 nodes are available: 1 node(s) had taints that the pod didn't tolerate.
kube-system   26m     Warning   FailedScheduling   Pod          0/1 nodes are available: 1 node(s) had taints that the pod didn't tolerate.
kube-system   22m     Warning   FailedScheduling   Pod          0/1 nodes are available: 1 node(s) had taints that the pod didn't tolerate.
kube-system   21m     Warning   FailedScheduling   Pod          skip schedule deleting pod: kube-system/coredns-7d6cf57b54-w95g4
kube-system   21m     Normal    SuccessfulCreate   ReplicaSet   Created pod: coredns-7d6cf57b54-bk6j5
kube-system   26m     Warning   DNSConfigForming   Pod          Nameserver limits were exceeded, some nameservers have been omitted, the applied nameserver line is: 10.96.0.10 10.51.40.100 10.51.40.103
kube-system   23m     Normal    SandboxChanged     Pod          Pod sandbox changed, it will be killed and re-created.
kube-system   23m     Normal    Pulled             Pod          Container image "registry.eccd.local:5000/kube-apiserver:v1.13.5-1-80cc0db3" already present on machine
kube-system   23m     Normal    Created            Pod          Created container
kube-system   23m     Normal    Started            Pod          Started container
kube-system   26m     Warning   DNSConfigForming   Pod          Nameserver limits were exceeded, some nameservers have been omitted, the applied nameserver line is: 10.96.0.10 10.51.40.100 10.51.40.103
kube-system   23m     Normal    SandboxChanged     Pod          Pod sandbox changed, it will be killed and re-created.
kube-system   23m     Normal    Pulled             Pod          Container image "registry.eccd.local:5000/kube-controller-manager:v1.13.5-1-80cc0db3" already present on machine
kube-system   23m     Normal    Created            Pod          Created container
kube-system   23m     Normal    Started            Pod          Started container
kube-system   23m     Normal    LeaderElection     Endpoints    master-0-eccdtest_ed8f0ece-a6cd-11e9-9dd7-fa163e182aab became leader
kube-system   26m     Warning   DNSConfigForming   Pod          Nameserver limits were exceeded, some nameservers have been omitted, the applied nameserver line is: 10.96.0.10 10.51.40.100 10.51.40.103
kube-system   23m     Normal    SandboxChanged     Pod          Pod sandbox changed, it will be killed and re-created.
kube-system   23m     Normal    Pulled             Pod          Container image "registry.eccd.local:5000/kube-proxy:v1.13.5-1-80cc0db3" already present on machine
kube-system   23m     Normal    Created            Pod          Created container
kube-system   23m     Normal    Started            Pod          Started container
kube-system   26m     Warning   DNSConfigForming   Pod          Nameserver limits were exceeded, some nameservers have been omitted, the applied nameserver line is: 10.96.0.10 10.51.40.100 10.51.40.103
kube-system   23m     Normal    SandboxChanged     Pod          Pod sandbox changed, it will be killed and re-created.
kube-system   23m     Normal    Pulled             Pod          Container image "registry.eccd.local:5000/kube-scheduler:v1.13.5-1-80cc0db3" already present on machine
kube-system   23m     Normal    Created            Pod          Created container
kube-system   23m     Normal    Started            Pod          Started container
kube-system   23m     Normal    LeaderElection     Endpoints    master-0-eccdtest_ee2520c1-a6cd-11e9-96a3-fa163e182aab became leader
kube-system   26m     Warning   DNSConfigForming   Pod          Nameserver limits were exceeded, some nameservers have been omitted, the applied nameserver line is: 10.96.0.10 10.51.40.100 10.51.40.103
kube-system   36m     Warning   BackOff            Pod          Back-off restarting failed container
kube-system   23m     Normal    SandboxChanged     Pod          Pod sandbox changed, it will be killed and re-created.
kube-system   20m     Normal    Pulled             Pod          Container image "registry.eccd.local:5000/openstack-cloud-controller-manager:v1.14.0-1-11023d82" already present on machine
kube-system   20m     Normal    Created            Pod          Created container
kube-system   20m     Normal    Started            Pod          Started container
kube-system   3m20s   Warning   BackOff            Pod          Back-off restarting failed container

The only nameserver in resolv.conf is:

nameserver 10.96.0.10
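The DNSConfigForming warnings in the events suggest that the resolv.conf the kubelet actually reads contains more than the three nameservers glibc supports, so the extra ones are dropped. It may be worth checking which file that is; the paths below are common defaults, not something confirmed in the question:

cat /etc/resolv.conf
ps aux | grep kubelet | grep resolv-conf

On hosts running systemd-resolved, the kubelet is typically started with --resolv-conf=/run/systemd/resolve/resolv.conf, so that is the file whose nameserver entries end up in the pods.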

I have searched extensively on Google for these issues but have not found a working solution. Any suggestions would be appreciated.

TIA

Vit*_*Vit

Your main problem is the 0/1 nodes are available: 1 node(s) had taints that the pod didn't tolerate warning message. You are getting it because of the node-role.kubernetes.io/master:NoSchedule and node.kubernetes.io/not-ready:NoSchedule taints.

These taints prevent pods from being scheduled on the node.
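To see which taints are currently set, check the node (the name below is from your question):

kubectl describe node master-0-eccdtest | grep -A3 Taints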

If you want to be able to schedule pods on the control-plane node, e.g. for a single-machine Kubernetes cluster for development, run:

kubectl taint nodes instance-1 node-role.kubernetes.io/master-
kubectl taint nodes instance-1 node.kubernetes.io/not-ready:NoSchedule-
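The trailing '-' after each taint tells kubectl to remove that taint rather than add it. instance-1 is the node name from my own cluster; in your case the command would target master-0-eccdtest, for example:

kubectl taint nodes master-0-eccdtest node-role.kubernetes.io/master-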

But from my point of view, it would be better to:

- bootstrap the cluster with kubeadm

- apply a CNI

- add a new worker node

- and let all new pods be scheduled on the worker node.

sudo kubeadm init --pod-network-cidr=192.168.0.0/16
[init] Using Kubernetes version: v1.15.0
...

Your Kubernetes control-plane has initialized successfully!

$ mkdir -p $HOME/.kube
$ sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
$ sudo chown $(id -u):$(id -g) $HOME/.kube/config


$ kubectl apply -f https://docs.projectcalico.org/v3.7/manifests/calico.yaml
configmap/calico-config created
customresourcedefinition.apiextensions.k8s.io/felixconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamblocks.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/blockaffinities.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamhandles.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamconfigs.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/bgppeers.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/bgpconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ippools.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/hostendpoints.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/clusterinformations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworksets.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/networkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/networksets.crd.projectcalico.org created
clusterrole.rbac.authorization.k8s.io/calico-kube-controllers created
clusterrolebinding.rbac.authorization.k8s.io/calico-kube-controllers created
clusterrole.rbac.authorization.k8s.io/calico-node created
clusterrolebinding.rbac.authorization.k8s.io/calico-node created
daemonset.extensions/calico-node created
serviceaccount/calico-node created
deployment.extensions/calico-kube-controllers created
serviceaccount/calico-kube-controllers created


- Add the worker node by running the kubeadm join command on the new node, as shown below.
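The exact join command, including a fresh token and the CA certificate hash, is printed at the end of the kubeadm init output. It has this shape (the token and hash below are placeholders; 10.132.0.36 is the master IP from my output above):

sudo kubeadm join 10.132.0.36:6443 --token <token> \
    --discovery-token-ca-cert-hash sha256:<hash>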

$ kubectl get nodes
NAME         STATUS   ROLES    AGE   VERSION
instance-1   Ready    master   21m   v1.15.0
instance-2   Ready    <none>   34s   v1.15.0

$ kubectl get pods --all-namespaces -o wide
NAMESPACE     NAME                                       READY   STATUS    RESTARTS   AGE    IP               NODE         NOMINATED NODE   READINESS GATES
kube-system   calico-kube-controllers-658558ddf8-v2rqx   1/1     Running   0          11m    192.168.23.129   instance-1   <none>           <none>
kube-system   calico-node-c2tkt                          1/1     Running   0          11m    10.132.0.36      instance-1   <none>           <none>
kube-system   calico-node-dhc66                          1/1     Running   0          107s   10.132.0.38      instance-2   <none>           <none>
kube-system   coredns-5c98db65d4-dqjm7                   1/1     Running   0          22m    192.168.23.130   instance-1   <none>           <none>
kube-system   coredns-5c98db65d4-hh7vd                   1/1     Running   0          22m    192.168.23.131   instance-1   <none>           <none>
kube-system   etcd-instance-1                            1/1     Running   0          21m    10.132.0.36      instance-1   <none>           <none>
kube-system   kube-apiserver-instance-1                  1/1     Running   0          21m    10.132.0.36      instance-1   <none>           <none>
kube-system   kube-controller-manager-instance-1         1/1     Running   0          21m    10.132.0.36      instance-1   <none>           <none>
kube-system   kube-proxy-qwvkq                           1/1     Running   0          107s   10.132.0.38      instance-2   <none>           <none>
kube-system   kube-proxy-s9gng                           1/1     Running   0          22m    10.132.0.36      instance-1   <none>           <none>
kube-system   kube-scheduler-instance-1                  1/1     Running   0          21m    10.132.0.36      instance-1   <none>           <none>