Kubernetes does not start after system reboot (Ubuntu)

wal*_*auf 8 ubuntu kubernetes

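I installed K8s on two Ubuntu machines in VirtualBox (Master and Node01). After the installation (I followed the K8s documentation site) I ran kubectl get nodes and both servers were in Ready status. But after rebooting the system, I got this: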

# kubectl get nodes 
The connection to the server localhost:8080 was refused - did you specify the 
right host or port? 

I checked the kubelet service, and it is running:

# systemctl status kubelet
kubelet.service - kubelet: The Kubernetes Node Agent 
   Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled) 
  Drop-In: /etc/systemd/system/kubelet.service.d 
           └─10-kubeadm.conf 
   Active: active (running) since Mon 2017-04-24 10:01:51 CEST; 15min ago 
     Docs: http://kubernetes.io/docs/ 
Main PID: 13128 (kubelet) 
    Tasks: 21 
   Memory: 48.2M 
      CPU: 58.014s 
   CGroup: /system.slice/kubelet.service 
           ├─13128 /usr/bin/kubelet --kubeconfig=/etc/kubernetes/kubelet.conf --require-kubeconfig=true --pod-manifest-path=/etc/kubernetes/manifests --allow-privileged=true --cluster-dns=10.96.0.10 --cluster-domain=cluster.local 
           └─13164 journalctl -k -f 

Apr 24 10:16:40 master kubelet[13128]: I0424 10:16:40.204156   13128 kuberuntime_manager.go:752] Back-off 5m0s restarting failed container=weave pod=weave-net-5qgvz_kube-system(4b7bb2f0-2691-11e7-bfb6-080027229776) 
Apr 24 10:16:40 master kubelet[13128]: E0424 10:16:40.204694   13128 pod_workers.go:182] Error syncing pod 4b7bb2f0-2691-11e7-bfb6-080027229776 ("weave-net-5qgvz_kube-system(4b7bb2f0-2691-11e7-bfb6-080027229776)"), skipping: fail 
Apr 24 10:16:42 master kubelet[13128]: I0424 10:16:42.972302   13128 operation_generator.go:597] MountVolume.SetUp succeeded for volume "kubernetes.io/secret/2b59d0d9-2692-11e7-bfb6-080027229776-default-token-h3v7c" (spec.Name: " 
Apr 24 10:16:48 master kubelet[13128]: I0424 10:16:48.949731   13128 operation_generator.go:597] MountVolume.SetUp succeeded for volume "kubernetes.io/secret/2bb42bc1-2692-11e7-bfb6-080027229776-default-token-h3v7c" (spec.Name: " 
Apr 24 10:16:51 master kubelet[13128]: I0424 10:16:51.978663   13128 operation_generator.go:597] MountVolume.SetUp succeeded for volume "kubernetes.io/secret/2b023c31-2692-11e7-bfb6-080027229776-default-token-h3v7c" (spec.Name: " 
Apr 24 10:16:52 master kubelet[13128]: I0424 10:16:52.909589   13128 operation_generator.go:597] MountVolume.SetUp succeeded for volume "kubernetes.io/secret/4b7bb2f0-2691-11e7-bfb6-080027229776-default-token-gslqd" (spec.Name: " 
Apr 24 10:16:53 master kubelet[13128]: I0424 10:16:53.186057   13128 kuberuntime_manager.go:458] Container {Name:weave Image:weaveworks/weave-kube:1.9.4 Command:[/home/weave/launch.sh] Args:[] WorkingDir: Ports:[] EnvFrom:[] Env: 
Apr 24 10:16:53 master kubelet[13128]: I0424 10:16:53.188091   13128 kuberuntime_manager.go:742] checking backoff for container "weave" in pod "weave-net-5qgvz_kube-system(4b7bb2f0-2691-11e7-bfb6-080027229776)" 
Apr 24 10:16:53 master kubelet[13128]: I0424 10:16:53.188717   13128 kuberuntime_manager.go:752] Back-off 5m0s restarting failed container=weave pod=weave-net-5qgvz_kube-system(4b7bb2f0-2691-11e7-bfb6-080027229776) 
Apr 24 10:16:53 master kubelet[13128]: E0424 10:16:53.189136   13128 pod_workers.go:182] Error syncing pod 4b7bb2f0-2691-11e7-bfb6-080027229776 ("weave-net-5qgvz_kube-system(4b7bb2f0-2691-11e7-bfb6-080027229776)"), skipping: fail 

Here is the systemd journal from restarting kubelet: Google Drive

...I'm not sure what I missed in the docs or what happened to kubelet. Could you help me, please? :]

• Ubuntu version

cat /etc/os-release 
NAME="Ubuntu" 
VERSION="16.04.2 LTS (Xenial Xerus)" 
ID=ubuntu 
ID_LIKE=debian 
PRETTY_NAME="Ubuntu 16.04.2 LTS" 
VERSION_ID="16.04" 
HOME_URL="http://www.ubuntu.com/" 
SUPPORT_URL="http://help.ubuntu.com/" 
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/" 
VERSION_CODENAME=xenial 
UBUNTU_CODENAME=xenial 

• Kernel

# uname -a 
Linux ubuntu 4.4.0-72-generic #93-Ubuntu SMP Fri Mar 31 14:07:41 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux 

• Kubectl version

# kubectl version 
Client Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.1", GitCommit:"b0b7a323cc5a4a2019b2e9520c21c7830b7f708e", GitTreeState:"clean", BuildDate:"2017-04-03T20:44:38Z", GoVersion:"go1.7.5", Compiler:"gc", Platform:"linux/amd64"} 
Server Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.0", GitCommit:"fff5156092b56e6bd60fff75aad4dc9de6b6ef37", GitTreeState:"clean", BuildDate:"2017-03-28T16:24:30Z", GoVersion:"go1.7.5", Compiler:"gc", Platform:"linux/amd64"} 

• Kubeadm version

# kubeadm version 
kubeadm version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.1", GitCommit:"b0b7a323cc5a4a2019b2e9520c21c7830b7f708e", GitTreeState:"clean", BuildDate:"2017-04-03T20:33:27Z", GoVersion:"go1.7.5", Compiler:"gc", Platform:"linux/amd64"} 

• Kubelet version

# kubelet --version 
Kubernetes v1.6.1 

• Docker version

# docker version 
Client: 
Version:      1.11.2 
API version:  1.23 
Go version:   go1.5.4 
Git commit:   b9f10c9 
Built:        Wed Jun  1 22:00:43 2016 
OS/Arch:      linux/amd64 

Server: 
Version:      1.11.2 
API version:  1.23 
Go version:   go1.5.4 
Git commit:   b9f10c9 
Built:        Wed Jun  1 22:00:43 2016 
OS/Arch:      linux/amd64 

Ara*_*thy 15

I ran into the same problem on Kubernetes 1.12.3 and Ubuntu 16.04.05. I then looked at the kubelet logs by running

$ journalctl -u kubelet

In the logs I saw that kubelet was complaining (and exiting with status 255) that swap was turned on.
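The kubelet journal can be long, so a small filter helps when looking for the swap complaint. This is only a sketch: the function name `swap_lines` is made up for illustration, and it simply keeps any line that mentions "swap", whatever the exact wording of the message is in a given Kubernetes version.

```shell
#!/usr/bin/env bash
# Keep only the log lines that mention swap (case-insensitive).
# Reads from stdin so it can be fed from journalctl or from a saved log file.
swap_lines() {
    grep -i 'swap'
}
```

It could be used as `journalctl -u kubelet -b --no-pager | swap_lines`, where `-b` limits the output to the current boot. No output means no line mentioned swap.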

So I turned swap off by running

$ swapoff -a

Then I edited fstab and commented out the swap entry

$ vi /etc/fstab
#comment out line with swap
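The manual fstab edit can also be scripted. The following is a minimal sketch, not a vetted tool: the function name `comment_out_swap` is hypothetical, and it assumes the standard fstab layout where the third whitespace-separated field is the filesystem type. It writes a `.bak` copy first, so the original can be restored.

```shell
#!/usr/bin/env bash
# Comment out every active swap entry in an fstab-style file.
# The path is a parameter so the function can be tried on a copy first.
comment_out_swap() {
    local fstab="${1:-/etc/fstab}"
    # Keep a backup next to the original before rewriting it.
    cp "$fstab" "$fstab.bak"
    # Prefix '#' to lines that are not already comments and whose
    # third field (the filesystem type) is exactly "swap".
    awk '!/^[[:space:]]*#/ && $3 == "swap" { print "#" $0; next } { print }' \
        "$fstab.bak" > "$fstab"
}
```

Trying it on a copy first, for example `cp /etc/fstab /tmp/fstab.test && comment_out_swap /tmp/fstab.test`, shows what would change before touching the real file (which needs root).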

Then I rebooted the system. Once it came back up, I verified that swap was disabled by running

$ free -m

and checked that the Swap row showed 0. It did.
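For a check that can run in a script instead of reading `free -m` by eye, one option is to look at `/proc/swaps`, which lists active swap areas below a header line. A sketch under that assumption (the function name `swap_is_off` is invented here):

```shell
#!/usr/bin/env bash
# Succeed (exit status 0) when no swap area is active.
# /proc/swaps has one header line followed by one line per active swap area;
# the path is a parameter only so the function can be tested on a sample file.
swap_is_off() {
    local swaps="${1:-/proc/swaps}"
    # Count non-empty lines after the header; exit 1 if there are any.
    awk 'NR > 1 && NF > 0 { n++ } END { exit (n > 0) }' "$swaps"
}
```

Usage could look like `swap_is_off && echo "swap is disabled"`.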

Then I verified that the kubelet service had started successfully by running

$ systemctl status kubelet

It had started successfully. I also double-checked the journalctl logs; this time there was no swap error.

I verified the k8s node status by running

$ kubectl get nodes

which now works and shows the expected output.

Note: I had also set KUBECONFIG in my .bash_profile beforehand.

root@k8s-master:~# cat .bash_profile
export KUBECONFIG="/etc/kubernetes/admin.conf"
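As far as I know, kubectl falls back to localhost:8080 when it cannot find a kubeconfig, which matches the error in the question, so it is worth checking what the current shell actually points at. A small sketch with a made-up helper name `check_kubeconfig`; it treats KUBECONFIG as a single path, although kubectl also accepts a colon-separated list.

```shell
#!/usr/bin/env bash
# Report which kubeconfig file the current shell points kubectl at,
# and whether that file is readable by the current user.
check_kubeconfig() {
    # Fall back to kubectl's default location when KUBECONFIG is unset or empty.
    local cfg="${KUBECONFIG:-$HOME/.kube/config}"
    if [ -r "$cfg" ]; then
        echo "ok: $cfg"
    else
        echo "missing or unreadable: $cfg" >&2
        return 1
    fi
}
```

Running `check_kubeconfig` in a fresh login shell after a reboot shows whether the export in the profile file was picked up.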


wal*_*auf 4

I had a wrongly exported KUBECONFIG variable, which kubectl needs (the history is in the comments under the question).

Saving KUBECONFIG=$HOME/admin.conf in ~/.zprofile solved my problem.

After reloading the environment variables, kubectl started working:

# kubectl get nodes
NAME      STATUS     AGE       VERSION
master    Ready      5d        v1.6.1
node01    NotReady   5d        v1.6.1