kubelet.service:来自 kubernetes 集群的单元在未就绪状态节点错误中进入失败状态

Mr.*_*Eng 5 kubernetes kubelet

我正在尝试在具有 1 个主节点和 2 个工作节点的 kubernetes 集群中部署 springboot 微服务。当我尝试使用命令获取节点状态时sudo kubectl get nodes,我发现我的工作节点之一尚未准备好。状态显示未就绪。

当我申请对以下命令进行故障排除时,

sudo journalctl -u kubelet
Run Code Online (Sandbox Code Playgroud)

我收到类似kubelet.service: Unit entered failed statekubelet 服务已停止的响应。以下是我在应用命令时得到的响应sudo journalctl -u kubelet

-- Logs begin at Fri 2020-01-03 04:56:18 EST, end at Fri 2020-01-03 05:32:47 EST. --
Jan 03 04:56:25 MILDEVKUB050 systemd[1]: Started kubelet: The Kubernetes Node Agent.
Jan 03 04:56:31 MILDEVKUB050 kubelet[970]: Flag --cgroup-driver has been deprecated, This parameter should be set via the config file specified by the Kubelet's --confi
Jan 03 04:56:31 MILDEVKUB050 kubelet[970]: Flag --cgroup-driver has been deprecated, This parameter should be set via the config file specified by the Kubelet's --confi
Jan 03 04:56:32 MILDEVKUB050 kubelet[970]: I0103 04:56:32.053962     970 server.go:416] Version: v1.17.0
Jan 03 04:56:32 MILDEVKUB050 kubelet[970]: I0103 04:56:32.084061     970 plugins.go:100] No cloud provider specified.
Jan 03 04:56:32 MILDEVKUB050 kubelet[970]: I0103 04:56:32.235928     970 server.go:821] Client rotation is on, will bootstrap in background
Jan 03 04:56:32 MILDEVKUB050 kubelet[970]: I0103 04:56:32.280173     970 certificate_store.go:129] Loading cert/key pair from "/var/lib/kubelet/pki/kubelet-client-curre
Jan 03 04:56:38 MILDEVKUB050 kubelet[970]: I0103 04:56:38.107966     970 server.go:641] --cgroups-per-qos enabled, but --cgroup-root was not specified.  defaulting to /
Jan 03 04:56:38 MILDEVKUB050 kubelet[970]: F0103 04:56:38.109401     970 server.go:273] failed to run Kubelet: running with swap on is not supported, please disable swa
Jan 03 04:56:38 MILDEVKUB050 systemd[1]: kubelet.service: Main process exited, code=exited, status=255/n/a
Jan 03 04:56:38 MILDEVKUB050 systemd[1]: kubelet.service: Unit entered failed state.
Jan 03 04:56:38 MILDEVKUB050 systemd[1]: kubelet.service: Failed with result 'exit-code'.
Jan 03 04:56:48 MILDEVKUB050 systemd[1]: kubelet.service: Service hold-off time over, scheduling restart.
Jan 03 04:56:48 MILDEVKUB050 systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
Jan 03 04:56:48 MILDEVKUB050 systemd[1]: Started kubelet: The Kubernetes Node Agent.
Jan 03 04:56:48 MILDEVKUB050 kubelet[1433]: Flag --cgroup-driver has been deprecated, This parameter should be set via the config file specified by the Kubelet's --conf
Jan 03 04:56:48 MILDEVKUB050 kubelet[1433]: Flag --cgroup-driver has been deprecated, This parameter should be set via the config file specified by the Kubelet's --conf
Jan 03 04:56:48 MILDEVKUB050 kubelet[1433]: I0103 04:56:48.901632    1433 server.go:416] Version: v1.17.0
Jan 03 04:56:48 MILDEVKUB050 kubelet[1433]: I0103 04:56:48.907654    1433 plugins.go:100] No cloud provider specified.
Jan 03 04:56:48 MILDEVKUB050 kubelet[1433]: I0103 04:56:48.907806    1433 server.go:821] Client rotation is on, will bootstrap in background
Jan 03 04:56:48 MILDEVKUB050 kubelet[1433]: I0103 04:56:48.947107    1433 certificate_store.go:129] Loading cert/key pair from "/var/lib/kubelet/pki/kubelet-client-curr
Jan 03 04:56:49 MILDEVKUB050 kubelet[1433]: I0103 04:56:49.263777    1433 server.go:641] --cgroups-per-qos enabled, but --cgroup-root was not specified.  defaulting to
Jan 03 04:56:49 MILDEVKUB050 kubelet[1433]: F0103 04:56:49.264219    1433 server.go:273] failed to run Kubelet: running with swap on is not supported, please disable sw
Jan 03 04:56:49 MILDEVKUB050 systemd[1]: kubelet.service: Main process exited, code=exited, status=255/n/a
Jan 03 04:56:49 MILDEVKUB050 systemd[1]: kubelet.service: Unit entered failed state.
Jan 03 04:56:49 MILDEVKUB050 systemd[1]: kubelet.service: Failed with result 'exit-code'.
Jan 03 04:56:59 MILDEVKUB050 systemd[1]: kubelet.service: Service hold-off time over, scheduling restart.
Jan 03 04:56:59 MILDEVKUB050 systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
Jan 03 04:56:59 MILDEVKUB050 systemd[1]: Started kubelet: The Kubernetes Node Agent.
Jan 03 04:56:59 MILDEVKUB050 kubelet[1500]: Flag --cgroup-driver has been deprecated, This parameter should be set via the config file specified by the Kubelet's --conf
Jan 03 04:56:59 MILDEVKUB050 kubelet[1500]: Flag --cgroup-driver has been deprecated, This parameter should be set via the config file specified by the Kubelet's --conf
Jan 03 04:56:59 MILDEVKUB050 kubelet[1500]: I0103 04:56:59.712729    1500 server.go:416] Version: v1.17.0
Jan 03 04:56:59 MILDEVKUB050 kubelet[1500]: I0103 04:56:59.714927    1500 plugins.go:100] No cloud provider specified.
Jan 03 04:56:59 MILDEVKUB050 kubelet[1500]: I0103 04:56:59.715248    1500 server.go:821] Client rotation is on, will bootstrap in background
Jan 03 04:56:59 MILDEVKUB050 kubelet[1500]: I0103 04:56:59.763508    1500 certificate_store.go:129] Loading cert/key pair from "/var/lib/kubelet/pki/kubelet-client-curr
Jan 03 04:56:59 MILDEVKUB050 kubelet[1500]: I0103 04:56:59.956706    1500 server.go:641] --cgroups-per-qos enabled, but --cgroup-root was not specified.  defaulting to
Jan 03 04:56:59 MILDEVKUB050 kubelet[1500]: F0103 04:56:59.957078    1500 server.go:273] failed to run Kubelet: running with swap on is not supported, please disable sw
Jan 03 04:56:59 MILDEVKUB050 systemd[1]: kubelet.service: Main process exited, code=exited, status=255/n/a
Jan 03 04:56:59 MILDEVKUB050 systemd[1]: kubelet.service: Unit entered failed state.
Jan 03 04:56:59 MILDEVKUB050 systemd[1]: kubelet.service: Failed with result 'exit-code'.
Jan 03 04:57:10 MILDEVKUB050 systemd[1]: kubelet.service: Service hold-off time over, scheduling restart.
Jan 03 04:57:10 MILDEVKUB050 systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
Jan 03 04:57:10 MILDEVKUB050 systemd[1]: Started kubelet: The Kubernetes Node Agent.
Run Code Online (Sandbox Code Playgroud)

日志文件:服务:单元进入失败状态

我尝试重新启动 kubelet。但节点状态仍然没有变化。仅未就绪状态。

更新

当我尝试该命令时 systemctl list-units --type=swap --state=active,我收到以下响应,

docker@MILDEVKUB040:~$ systemctl list-units --type=swap --state=active
UNIT                                            LOAD   ACTIVE SUB    DESCRIPTION
dev-mapper-MILDEVDCR01\x2d\x2dvg\x2dswap_1.swap loaded active active /dev/mapper/MILDEVDCR01--vg-swap_1
Run Code Online (Sandbox Code Playgroud)

重要的

当我遇到此类节点未准备好的问题时,每次我都需要禁用交换并需要重新加载守护进程和 kubelet。之后该节点变为就绪状态。我需要再次重复同样的事情。

我怎样才能找到永久的解决方案?

Sha*_*k V 5

failed to run Kubelet: running with swap on is not supported, please disable swap

您需要禁用系统上的交换才能使 kubelet 正常工作。您可以禁用交换sudo swapoff -a

对于基于systemd的系统,还有另一种使用交换单元启用交换分区的方法,即使您已使用以下命令关闭交换,只要 systemd 重新加载,该方法也会启用swapoff -a

https://www.freedesktop.org/software/systemd/man/systemd.swap.html

检查您是否有任何交换单元systemctl list-units --type=swap --state=active

您可以使用 永久禁用任何活动交换单元systemctl mask <unit name>

注意:不要使用systemctl disable <unit name>禁用交换单元,因为当 systemd 重新加载时交换单元将再次激活。systemctl mask <unit name>仅使用。

为了确保当系统由于电源循环或任何其他原因重新启动时交换不会重新启用,请删除或注释掉交换条目/etc/fstab

总结:

  1. 跑步sudo swapoff -a

  2. 使用命令检查是否有交换单位systemctl list-units --type=swap --state=active。如果有任何活动的交换单元,请使用以下命令屏蔽它们 systemctl mask <unit name>

  3. 删除交换条目/etc/fstab