kubedns fails to start with "Failed to list *v1.Endpoints: Unauthorized" and "Failed to list *v1.Service: Unauthorized"

fra*_*ank 5 kubernetes kube-dns

I am having trouble installing the kube-dns add-on. My OS is CentOS Linux release 7.0.1406 (Core).

Kernel:Linux master 3.10.0-693.el7.x86_64 #1 SMP Tue Aug 22 21:09:27 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

My api-server configuration:

###
# kubernetes system config
#
# The following values are used to configure the kube-apiserver
#

# The address on the local server to listen to.
#KUBE_API_ADDRESS="--insecure-bind-address=177.1.1.40"

# The port on the local server to listen on.
KUBE_API_PORT="--secure-port=443"

# Port minions listen on
KUBELET_PORT="--kubelet-port=10250"

# Comma separated list of nodes in the etcd cluster
KUBE_ETCD_SERVERS="--etcd-servers=http://master:2379"

# Address range to use for services
KUBE_SERVICE_ADDRESSES="--service-cluster-ip-range=10.254.0.0/16"

# default admission control policies
KUBE_ADMISSION_CONTROL="--enable-admission-plugins=AlwaysAdmit,NamespaceLifecycle,LimitRanger,SecurityContextDeny,ResourceQuota,ServiceAccount"

# Add your own!
KUBE_API_ARGS="--client-ca-file=/etc/kubernetes/k8s-certs/CA/rootCA.crt --tls-private-key-file=/etc/kubernetes/k8s-certs/master/api-server.pem --tls-cert-file=/etc/kubernetes/k8s-certs/master/api-server.crt"

The api-server authorization mode is set to AlwaysAllow:

Sep 29 17:31:22 master kube-apiserver: I0929 17:31:22.952730    1311 flags.go:27] FLAG: --authorization-mode="AlwaysAllow"

My kube-dns configuration YAML file is:

# Copyright 2016 The Kubernetes Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

apiVersion: v1
kind: Service
metadata:
  name: kube-dns
  namespace: kube-system
  labels:
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
    kubernetes.io/name: "KubeDNS"
spec:
  selector:
    k8s-app: kube-dns
  clusterIP: 10.254.0.10
  ports:
  - name: dns
    port: 53
    protocol: UDP
  - name: dns-tcp
    port: 53
    protocol: TCP
---
#apiVersion: rbac.authorization.k8s.io/v1
#kind: RoleBinding
#metadata:
#  name: kube-dns
#  namespace: kube-system
#roleRef:
#  apiGroup: rbac.authorization.k8s.io
#  kind: ClusterRole
#  name: Cluster-admin
#subjects:
#- kind: ServiceAccount
#  name: default
#  namespace: kube-system
#---
#apiVersion: v1
#kind: ServiceAccount
#metadata:
#  name: kube-dns
#  namespace: kube-system
#  labels:
#    kubernetes.io/cluster-service: "true"
#    addonmanager.kubernetes.io/mode: Reconcile
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-dns
  namespace: kube-system
  labels:
    addonmanager.kubernetes.io/mode: EnsureExists
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: kubecfg-file
  namespace: kube-system
  labels:
    addonmanager.kubernetes.io/mode: EnsureExists
data:
  kubecfg-file: |
    apiVersion: v1
    kind: Config
    clusters:
      - cluster:
          certificate-authority: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
          server: https://177.1.1.40:443
        name: kube-test
    users:
    - name: kube-admin
      user:
        token: /var/run/secrets/kubernetes.io/serviceaccount/token
    contexts:
    - context:
        cluster: kube-test
        namespace: default
        user: kube-admin
      name: test-context
    current-context: test-context
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: kube-dns
  namespace: kube-system
  labels:
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  # replicas: not specified here:
  # 1. In order to make Addon Manager do not reconcile this replicas parameter.
  # 2. Default is 1.
  # 3. Will be tuned in real time if DNS horizontal auto-scaling is turned on.
  strategy:
    rollingUpdate:
      maxSurge: 10%
      maxUnavailable: 0
  selector:
    matchLabels:
      k8s-app: kube-dns
  template:
    metadata:
      labels:
        k8s-app: kube-dns
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ''
    spec:
      tolerations:
      - key: "CriticalAddonsOnly"
        operator: "Exists"
      volumes:
      - name: kube-dns-config
        configMap:
          name: kube-dns
          optional: true
      - name: kube-kubecfg-file
        configMap:
          name: kubecfg-file
          optional: true
      containers:
      - name: kubedns
        image: 177.1.1.35/library/kube-dns:1.14.8
        resources:
          # TODO: Set memory limits when we've profiled the container for large
          # clusters, then set request = limit to keep this container in
          # guaranteed class. Currently, this container falls into the
          # "burstable" category so the kubelet doesn't backoff from restarting it.
          limits:
            memory: 170Mi
          requests:
            cpu: 100m
            memory: 70Mi
        livenessProbe:
          httpGet:
            path: /healthcheck/kubedns
            port: 10054
            scheme: HTTP
          initialDelaySeconds: 60
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 5
        readinessProbe:
          httpGet:
            path: /readiness
            port: 8081
            scheme: HTTP
          # we poll on pod startup for the Kubernetes master service and
          # only setup the /readiness HTTP server once that's available.
          initialDelaySeconds: 3
          timeoutSeconds: 5
        args:
        - --domain=cluster.local.
        - --dns-port=10053
        - --config-dir=/kube-dns-config
        - --kubecfg-file=/kubecfg-file/kubecfg-file
        - --kube-master-url=https://10.254.0.1:443
        - --v=2
        env:
        - name: PROMETHEUS_PORT
          value: "10055"
        ports:
        - containerPort: 10053
          name: dns-local
          protocol: UDP
        - containerPort: 10053
          name: dns-tcp-local
          protocol: TCP
        - containerPort: 10055
          name: metrics
          protocol: TCP
        volumeMounts:
        - name: kube-dns-config
          mountPath: /kube-dns-config
        - name: kube-kubecfg-file
          mountPath: /kubecfg-file
      - name: dnsmasq
        image: 177.1.1.35/library/dnsmasq:1.14.8
        livenessProbe:
          httpGet:
            path: /healthcheck/dnsmasq
            port: 10054
            scheme: HTTP
          initialDelaySeconds: 60
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 5
        args:
        - -v=2
        - -logtostderr
        - -configDir=/etc/k8s/dns/dnsmasq-nanny
        - -restartDnsmasq=true
        - --
        - -k
        - --cache-size=1000
        - --no-negcache
        - --log-facility=-
        - --server=/cluster.local/127.0.0.1#10053
        - --server=/in-addr.arpa/127.0.0.1#10053
        - --server=/ip6.arpa/127.0.0.1#10053
        ports:
        - containerPort: 53
          name: dns
          protocol: UDP
        - containerPort: 53
          name: dns-tcp
          protocol: TCP
        # see: https://github.com/kubernetes/kubernetes/issues/29055 for details
        resources:
          requests:
            cpu: 150m
            memory: 20Mi
        volumeMounts:
        - name: kube-dns-config
          mountPath: /etc/k8s/dns/dnsmasq-nanny
      - name: sidecar
        image: 177.1.1.35/library/sidecar:1.14.8
        livenessProbe:
          httpGet:
            path: /metrics
            port: 10054
            scheme: HTTP
          initialDelaySeconds: 60
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 5
        args:
        - --v=2
        - --logtostderr
        - --probe=kubedns,127.0.0.1:10053,kubernetes.default.svc.cluster.local,5,SRV
        - --probe=dnsmasq,127.0.0.1:53,kubernetes.default.svc.cluster.local,5,SRV
        ports:
        - containerPort: 10054
          name: metrics
          protocol: TCP
        resources:
          requests:
            memory: 20Mi
            cpu: 10m
      dnsPolicy: Default  # Don't use cluster DNS.
      #serviceAccountName: kube-dns
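
One detail in the kubecfg-file ConfigMap above may be worth checking. In a kubeconfig, the `token` field holds the bearer token string itself, while `tokenFile` names a file to read the token from. The ConfigMap sets `token` to a file path, so the client would presumably send that path as the credential, and the api-server could reject it as Unauthorized. The fragment below is a minimal sketch of the `users` section using `tokenFile` instead. It assumes the mounted service account token is the credential kube-dns is meant to use, and it has not been verified against this cluster.

```yaml
# Hypothetical variant of the users section inside the kubecfg-file ConfigMap.
# Assumption: kube-dns should authenticate with the mounted service account token.
users:
- name: kube-admin
  user:
    # tokenFile tells the client to read the bearer token from this path;
    # token would send the configured string itself as the bearer token.
    tokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
```

The rest of the kubeconfig (clusters, contexts, current-context) would stay as it is in the ConfigMap above.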

When I start kube-dns, the kubedns container logs:

I0929 09:33:22.666182       1 dns.go:48] version: 1.14.8
I0929 09:33:22.668521       1 server.go:71] Using configuration read from directory: /kube-dns-config with period 10s
I0929 09:33:22.668586       1 server.go:119] FLAG: --alsologtostderr="false"
I0929 09:33:22.668604       1 server.go:119] FLAG: --config-dir="/kube-dns-config"
I0929 09:33:22.668613       1 server.go:119] FLAG: --config-map=""
I0929 09:33:22.668619       1 server.go:119] FLAG: --config-map-namespace="kube-system"
I0929 09:33:22.668629       1 server.go:119] FLAG: --config-period="10s"
I0929 09:33:22.668637       1 server.go:119] FLAG: --dns-bind-address="0.0.0.0"
I0929 09:33:22.668643       1 server.go:119] FLAG: --dns-port="10053"
I0929 09:33:22.668662       1 server.go:119] FLAG: --domain="cluster.local."
I0929 09:33:22.668671       1 server.go:119] FLAG: --federations=""
I0929 09:33:22.668683       1 server.go:119] FLAG: --healthz-port="8081"
I0929 09:33:22.668689       1 server.go:119] FLAG: --initial-sync-timeout="1m0s"
I0929 09:33:22.668695       1 server.go:119] FLAG: --kube-master-url="https://10.254.0.1:443"
I0929 09:33:22.668707       1 server.go:119] FLAG: --kubecfg-file="/kubecfg-file/kubecfg-file"
I0929 09:33:22.668714       1 server.go:119] FLAG: --log-backtrace-at=":0"
I0929 09:33:22.668727       1 server.go:119] FLAG: --log-dir=""
I0929 09:33:22.668733       1 server.go:119] FLAG: --log-flush-frequency="5s"
I0929 09:33:22.668739       1 server.go:119] FLAG: --logtostderr="true"
I0929 09:33:22.668744       1 server.go:119] FLAG: --nameservers=""
I0929 09:33:22.668754       1 server.go:119] FLAG: --stderrthreshold="2"
I0929 09:33:22.668760       1 server.go:119] FLAG: --v="2"
I0929 09:33:22.668765       1 server.go:119] FLAG: --version="false"
I0929 09:33:22.668774       1 server.go:119] FLAG: --vmodule=""
I0929 09:33:22.668831       1 server.go:201] Starting SkyDNS server (0.0.0.0:10053)
I0929 09:33:22.669125       1 server.go:220] Skydns metrics enabled (/metrics:10055)
I0929 09:33:22.669170       1 dns.go:146] Starting endpointsController
I0929 09:33:22.669181       1 dns.go:149] Starting serviceController
I0929 09:33:22.669508       1 logs.go:41] skydns: ready for queries on cluster.local. for tcp://0.0.0.0:10053 [rcache 0]
I0929 09:33:22.669523       1 logs.go:41] skydns: ready for queries on cluster.local. for udp://0.0.0.0:10053 [rcache 0]
E0929 09:33:22.695489       1 reflector.go:201] k8s.io/dns/pkg/dns/dns.go:147: Failed to list *v1.Endpoints: Unauthorized
E0929 09:33:22.696267       1 reflector.go:201] k8s.io/dns/pkg/dns/dns.go:150: Failed to list *v1.Service: Unauthorized
I0929 09:33:23.169540       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0929 09:33:23.670206       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...

After a few minutes, the pod crashed.

kubectl describe pod -n kube-system kube-dns-b7d556f59-h8xqp 
Name:           kube-dns-b7d556f59-h8xqp
Namespace:      kube-system
Node:           node3/177.1.1.43
Start Time:     Sat, 29 Sep 2018 17:50:17 +0800
Labels:         k8s-app=kube-dns
                pod-template-hash=638112915
Annotations:    scheduler.alpha.kubernetes.io/critical-pod=
Status:         Running
IP:             172.30.59.3
Controlled By:  ReplicaSet/kube-dns-b7d556f59
Containers:
  kubedns:
    Container ID:  docker://5d62497e0c966c08d4d8c56f7a52e2046fd05b57ec0daf34a7e3cd813e491f09
    Image:         177.1.1.35/library/kube-dns:1.14.8
    Image ID:      docker-pullable://177.1.1.35/library/kube-dns@sha256:6d8e0da4fb46e9ea2034a3f4cab0e095618a2ead78720c12e791342738e5f85d
    Ports:         10053/UDP, 10053/TCP, 10055/TCP
    Host Ports:    0/UDP, 0/TCP, 0/TCP
    Args:
      --domain=cluster.local.
      --dns-port=10053
      --config-dir=/kube-dns-config
      --kubecfg-file=/kubecfg-file/kubecfg-file
      --kube-master-url=https://10.254.0.1:443
      --v=2
    State:          Running
      Started:      Sat, 29 Sep 2018 17:50:20 +0800
    Ready:          False
    Restart Count:  0
    Limits:
      memory:  170Mi
    Requests:
      cpu:      100m
      memory:   70Mi
    Liveness:   http-get http://:10054/healthcheck/kubedns delay=60s timeout=5s period=10s #success=1 #failure=5
    Readiness:  http-get http://:8081/readiness delay=3s timeout=5s period=10s #success=1 #failure=3
    Environment:
      PROMETHEUS_PORT:  10055
    Mounts:
      /kube-dns-config from kube-dns-config (rw)
      /kubecfg-file from kube-kubecfg-file (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-dsxql (ro)
  dnsmasq:
    Container ID:  docker://17ae73b52eb69c35a027cb5645a3801d649b262a8650862d64e7959a22c8e92e
    Image:         177.1.1.35/library/dnsmasq:1.14.8
    Image ID:      docker-pullable://177.1.1.35/library/dnsmasq@sha256:93c827f018cf3322f1ff2aa80324a0306048b0a69bc274e423071fb0d2d29d8b
    Ports:         53/UDP, 53/TCP
    Host Ports:    0/UDP, 0/TCP
    Args:
      -v=2
      -logtostderr
      -configDir=/etc/k8s/dns/dnsmasq-nanny
      -restartDnsmasq=true
      --
      -k
      --cache-size=1000
      --no-negcache
      --log-facility=-
      --server=/cluster.local/127.0.0.1#10053
      --server=/in-addr.arpa/127.0.0.1#10053
      --server=/ip6.arpa/127.0.0.1#10053
    State:          Running
      Started:      Sat, 29 Sep 2018 17:50:21 +0800
    Ready:          True
    Restart Count:  0
    Requests:
      cpu:        150m
      memory:     20Mi
    Liveness:     http-get http://:10054/healthcheck/dnsmasq delay=60s timeout=5s period=10s #success=1 #failure=5
    Environment:  <none>
    Mounts:
      /etc/k8s/dns/dnsmasq-nanny from kube-dns-config (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-dsxql (ro)
  sidecar:
    Container ID:  docker://9449b13ff4e4ba1331d181bd6f309a34a4f3da1ce536c61af7a65664e3ad803a
    Image:         177.1.1.35/library/sidecar:1.14.8
    Image ID:      docker-pullable://177.1.1.35/library/sidecar@sha256:23df717980b4aa08d2da6c4cfa327f1b730d92ec9cf740959d2d5911830d82fb
    Port:          10054/TCP
    Host Port:     0/TCP
    Args:
      --v=2
      --logtostderr
      --probe=kubedns,127.0.0.1:10053,kubernetes.default.svc.cluster.local,5,SRV
      --probe=dnsmasq,127.0.0.1:53,kubernetes.default.svc.cluster.local,5,SRV
    State:          Running
      Started:      Sat, 29 Sep 2018 17:50:22 +0800
    Ready:          True
    Restart Count:  0
    Requests:
      cpu:        10m
      memory:     20Mi
    Liveness:     http-get http://:10054/metrics delay=60s timeout=5s period=10s #success=1 #failure=5
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-dsxql (ro)
Conditions:
  Type           Status
  Initialized    True 
  Ready          False 
  PodScheduled   True 
Volumes:
  kube-dns-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      kube-dns
    Optional:  true
  kube-kubecfg-file:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      kubecfg-file
    Optional:  true
  default-token-dsxql:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-dsxql
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     CriticalAddonsOnly
                 node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason                 Age              From               Message
  ----     ------                 ----             ----               -------
  Normal   SuccessfulMountVolume  3m               kubelet, node3     MountVolume.SetUp succeeded for volume "kube-dns-config"
  Normal   SuccessfulMountVolume  3m               kubelet, node3     MountVolume.SetUp succeeded for volume "kube-kubecfg-file"
  Normal   SuccessfulMountVolume  3m               kubelet, node3     MountVolume.SetUp succeeded for volume "default-token-dsxql"
  Normal   Pulled                 3m               kubelet, node3     Container image "177.1.1.35/library/kube-dns:1.14.8" already present on machine
  Normal   Created                3m               kubelet, node3     Created container
  Normal   Started                3m               kubelet, node3     Started container
  Normal   Pulled                 3m               kubelet, node3     Container image "177.1.1.35/library/dnsmasq:1.14.8" already present on machine
  Normal   Created                3m               kubelet, node3     Created container
  Normal   Started                3m               kubelet, node3     Started container
  Normal   Pulled                 3m               kubelet, node3     Container image "177.1.1.35/library/sidecar:1.14.8" already present on machine
  Normal   Created                3m               kubelet, node3     Created container
  Normal   Started                3m               kubelet, node3     Started container
  Warning  Unhealthy              3m (x3 over 3m)  kubelet, node3     Readiness probe failed: Get http://172.30.59.3:8081/readiness: dial tcp 172.30.59.3:8081: getsockopt: connection refused
  Normal   Scheduled              43s              default-scheduler  Successfully assigned kube-dns-b7d556f59-h8xqp to node3

My Kubernetes version is:

kubectl version
Client Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.0", GitCommit:"fc32d2f3698e36b93322a3465f63a14e9f0eaead", GitTreeState:"archive", BuildDate:"2018-03-29T08:38:42Z", GoVersion:"go1.9.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.0", GitCommit:"fc32d2f3698e36b93322a3465f63a14e9f0eaead", GitTreeState:"archive", BuildDate:"2018-03-29T08:38:42Z", GoVersion:"go1.9.2", Compiler:"gc", Platform:"linux/amd64"}

Docker version:

docker version
Client:
 Version:         1.13.1
 API version:     1.26
 Package version: <unknown>
 Go version:      go1.8.3
 Git commit:      774336d/1.13.1
 Built:           Wed Mar  7 17:06:16 2018
 OS/Arch:         linux/amd64

Server:
 Version:         1.13.1
 API version:     1.26 (minimum version 1.12)
 Package version: <unknown>
 Go version:      go1.8.3
 Git commit:      774336d/1.13.1
 Built:           Wed Mar  7 17:06:16 2018
 OS/Arch:         linux/amd64
 Experimental:    false

My Kubernetes service configuration for the api-server:

/usr/bin/kube-apiserver --logtostderr=true --v=4 --etcd-servers=http://master:2379 --secure-port=443 --kubelet-port=10250 --allow-privileged=false --service-cluster-ip-range=10.254.0.0/16 --enable-admission-plugins=AlwaysAdmit,NamespaceLifecycle,LimitRanger,SecurityContextDeny,ResourceQuota,ServiceAccount --client-ca-file=/etc/kubernetes/k8s-certs/CA/rootCA.crt --tls-private-key-file=/etc/kubernetes/k8s-certs/master/api-server.pem --tls-cert-file=/etc/kubernetes/k8s-certs/master/api-server.crt

Controller manager:

/usr/bin/kube-controller-manager --logtostderr=true --v=4 --master=https://master:443 --root-ca-file=/etc/kubernetes/k8s-certs/CA/rootCA.crt --service-account-private-key-file=/etc/kubrnetes/k8s-certs/master/api-server.pem --kubeconfig=/etc/kubernetes/cs_kubeconfig

###
# kubernetes system config
#
# The following values are used to configure the kube-apiserver
#