K8s NodePort service is "unreachable by IP" on only 2 of the 4 slaves in the cluster

Viv*_*thi 9 kubernetes flannel

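I created a K8s cluster of 5 VMs (1 master and 4 slaves running Ubuntu 16.04.3 LTS) using kubeadm. I used flannel to set up networking in the cluster. I was able to deploy an application successfully and then exposed it via a NodePort service. From here things got complicated.

Before I started, I disabled the default firewalld service on the master and on the nodes.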

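As I understand from the K8s Services documentation, the NodePort type exposes the service on all nodes in the cluster. However, when I created it, the service was exposed on only 2 of the 4 slaves in the cluster. I am guessing this is not the expected behavior (right?)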

For troubleshooting, here are some resource specs:

root@vm-vivekse-003:~# kubectl get nodes
NAME              STATUS    AGE       VERSION
vm-deepejai-00b   Ready     5m        v1.7.3
vm-plashkar-006   Ready     4d        v1.7.3
vm-rosnthom-00f   Ready     4d        v1.7.3
vm-vivekse-003    Ready     4d        v1.7.3   //the master
vm-vivekse-004    Ready     16h       v1.7.3

root@vm-vivekse-003:~# kubectl get pods -o wide -n playground
NAME                                     READY     STATUS    RESTARTS   AGE       IP           NODE
kubernetes-bootcamp-2457653786-9qk80     1/1       Running   0          2d        10.244.3.6   vm-rosnthom-00f
springboot-helloworld-2842952983-rw0gc   1/1       Running   0          1d        10.244.3.7   vm-rosnthom-00f

root@vm-vivekse-003:~# kubectl get svc -o wide -n playground
NAME        CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE       SELECTOR
sb-hw-svc   10.101.180.19   <nodes>       9000:30847/TCP   5h        run=springboot-helloworld

root@vm-vivekse-003:~# kubectl describe svc sb-hw-svc -n playground
Name:               sb-hw-svc
Namespace:          playground
Labels:             <none>
Annotations:        <none>
Selector:           run=springboot-helloworld
Type:               NodePort
IP:                 10.101.180.19
Port:               <unset>   9000/TCP
NodePort:           <unset>   30847/TCP
Endpoints:          10.244.3.7:9000
Session Affinity:   None
Events:             <none>

root@vm-vivekse-003:~# kubectl get endpoints sb-hw-svc -n playground -o yaml
apiVersion: v1
kind: Endpoints
metadata:
  creationTimestamp: 2017-08-09T06:28:06Z
  name: sb-hw-svc
  namespace: playground
  resourceVersion: "588958"
  selfLink: /api/v1/namespaces/playground/endpoints/sb-hw-svc
  uid: e76d9cc1-7ccb-11e7-bc6a-fa163efaba6b
subsets:
- addresses:
  - ip: 10.244.3.7
    nodeName: vm-rosnthom-00f
    targetRef:
      kind: Pod
      name: springboot-helloworld-2842952983-rw0gc
      namespace: playground
      resourceVersion: "473859"
      uid: 16d9db68-7c1a-11e7-bc6a-fa163efaba6b
  ports:
  - port: 9000
    protocol: TCP

After some tinkering, I realized that on those two "faulty" nodes the service is not reachable even from the hosts themselves.

Node01 (working):

root@vm-vivekse-004:~# curl 127.0.0.1:30847      //<localhost>:<nodeport>
Hello Docker World!!
root@vm-vivekse-004:~# curl 10.101.180.19:9000   //<cluster-ip>:<port>
Hello Docker World!!
root@vm-vivekse-004:~# curl 10.244.3.7:9000      //<pod-ip>:<port>
Hello Docker World!!

Node02 (working):

root@vm-rosnthom-00f:~# curl 127.0.0.1:30847
Hello Docker World!!
root@vm-rosnthom-00f:~# curl 10.101.180.19:9000
Hello Docker World!!
root@vm-rosnthom-00f:~# curl 10.244.3.7:9000
Hello Docker World!!

Node03 (not working):

root@vm-plashkar-006:~# curl 127.0.0.1:30847
curl: (7) Failed to connect to 127.0.0.1 port 30847: Connection timed out
root@vm-plashkar-006:~# curl 10.101.180.19:9000
curl: (7) Failed to connect to 10.101.180.19 port 9000: Connection timed out
root@vm-plashkar-006:~# curl 10.244.3.7:9000
curl: (7) Failed to connect to 10.244.3.7 port 9000: Connection timed out

Node04 (not working):

root@vm-deepejai-00b:/# curl 127.0.0.1:30847
curl: (7) Failed to connect to 127.0.0.1 port 30847: Connection timed out
root@vm-deepejai-00b:/# curl 10.101.180.19:9000
curl: (7) Failed to connect to 10.101.180.19 port 9000: Connection timed out
root@vm-deepejai-00b:/# curl 10.244.3.7:9000
curl: (7) Failed to connect to 10.244.3.7 port 9000: Connection timed out
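The manual curl checks above can be repeated on every node with a small script. This is only a sketch: the function names `probe` and `probe_all` are made up for illustration, the addresses are copied from the kubectl output earlier in the question, and the 5-second timeout is an arbitrary choice so that an unreachable address fails quickly instead of hanging.

```shell
#!/usr/bin/env bash
# Values copied from the kubectl output in the question.
NODE_PORT=30847
CLUSTER_IP=10.101.180.19
POD_IP=10.244.3.7
SVC_PORT=9000

# probe LABEL HOST:PORT
# Prints "OK" or "FAIL" depending on whether curl gets a response
# within 5 seconds (the timeout is an assumption, tune as needed).
probe() {
  if curl --silent --output /dev/null --max-time 5 "http://$2"; then
    echo "OK   $1 ($2)"
  else
    echo "FAIL $1 ($2)"
  fi
}

# Run the same three checks as the manual tests:
# <localhost>:<nodeport>, <cluster-ip>:<port>, <pod-ip>:<port>
probe_all() {
  probe "nodeport"   "127.0.0.1:${NODE_PORT}"
  probe "cluster-ip" "${CLUSTER_IP}:${SVC_PORT}"
  probe "pod-ip"     "${POD_IP}:${SVC_PORT}"
}
```

Sourcing the file and calling `probe_all` on each node gives one line per address, which makes it easier to compare the working and non-working hosts side by side.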

I tried netstat and telnet on all 4 slaves. Here is the output:

Node01 (the working host):

root@vm-vivekse-004:~# netstat -tulpn | grep 30847
tcp6       0      0 :::30847                :::*                    LISTEN      27808/kube-proxy
root@vm-vivekse-004:~# telnet 127.0.0.1 30847
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.

Node02 (the working host):

root@vm-rosnthom-00f:~# netstat -tulpn | grep 30847
tcp6       0      0 :::30847                :::*                    LISTEN      11842/kube-proxy
root@vm-rosnthom-00f:~# telnet 127.0.0.1 30847
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.

Node03 (the non-working host):

root@vm-plashkar-006:~# netstat -tulpn | grep 30847
tcp6       0      0 :::30847                :::*                    LISTEN      7791/kube-proxy
root@vm-plashkar-006:~# telnet 127.0.0.1 30847
Trying 127.0.0.1...
telnet: Unable to connect to remote host: Connection timed out

Node04 (the non-working host):

root@vm-deepejai-00b:/# netstat -tulpn | grep 30847
tcp6       0      0 :::30847                :::*                    LISTEN      689/kube-proxy
root@vm-deepejai-00b:/# telnet 127.0.0.1 30847
Trying 127.0.0.1...
telnet: Unable to connect to remote host: Connection timed out

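Additional info:

From the kubectl get pods output, I can see that the pod is actually deployed on slave vm-rosnthom-00f. I am able to ping this host from all 5 VMs, and curl vm-rosnthom-00f:30847 also works from all the VMs.

I can clearly see that the internal cluster networking is messed up, but I am unsure how to resolve it! The iptables -L output is identical on all the slaves, and even the local loopback (ifconfig lo) is up and running on all of them. I'm completely clueless as to how to fix it!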

sfg*_*ups -4

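If you want to access the service from any node in the cluster, you need to define the service type as ClusterIP. Since you defined the service type as NodePort, you can connect only from the node where the service is running.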


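My answer above was not correct. According to the documentation, we should be able to connect via NodeIP:NodePort on any node. But it was not working in my cluster either.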


https://kubernetes.io/docs/concepts/services-networking/service/#publishing-services---service-types


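NodePort: Exposes the service on each Node's IP at a static port (the NodePort). A ClusterIP service, to which the NodePort service will route, is automatically created. You'll be able to contact the NodePort service, from outside the cluster, by requesting <NodeIP>:<NodePort>.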


IP forwarding was not enabled on one of my nodes. After enabling it, I was able to connect to my service using NodeIP:nodePort.

sysctl -w net.ipv4.ip_forward=1
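To see which nodes need the fix before changing anything, the current setting can be read from /proc. The following is a minimal sketch, not something from the original setup: the function name `ip_forward_status` and its optional file argument are hypothetical and exist only so the check can be dry-run against a test file.

```shell
#!/usr/bin/env bash
# Report whether IPv4 forwarding is enabled on this node.
# $1 (optional, hypothetical hook): file to read instead of the real
# /proc entry, useful for trying the function out safely.
ip_forward_status() {
  local file="${1:-/proc/sys/net/ipv4/ip_forward}"
  local value
  # If the file cannot be read, report "unknown" rather than guessing.
  value="$(cat "$file" 2>/dev/null)" || { echo "unknown"; return 2; }
  if [ "$value" = "1" ]; then
    echo "enabled"
  else
    echo "disabled"
    return 1
  fi
}
```

Calling `ip_forward_status` on each node should print `disabled` on the hosts that need the sysctl command above. Note that `sysctl -w` changes only the running kernel; to keep the setting across reboots, the usual approach is to add `net.ipv4.ip_forward = 1` to /etc/sysctl.conf (or a file under /etc/sysctl.d/) and reload it with `sysctl -p`.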