Asked by Vow*_*eee (score: 4) · Tags: kubernetes, azure-aks
I'd like to understand which factors AKS takes into account when reserving node memory, and how it calculates allocatable memory.
In my cluster we have multiple nodes (2 CPUs, 7 GB RAM each).
I observed that all nodes (18+) report only 4 GB of the 7 GB as allocatable. As a result, our cluster runs into a resource crunch for new deployments, and we have to increase the node count accordingly to meet resource demand.
Update (as mentioned in the comments below): added the output of the kubectl top nodes command. What is strange here is how a node's memory consumption percentage can exceed 100%.
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
aks-nodepool1-xxxxxxxxx-vmssxxxx00 265m 13% 2429Mi 53%
aks-nodepool1-xxxxxxxxx-vmssxxxx01 239m 12% 3283Mi 71%
aks-nodepool1-xxxxxxxxx-vmssxxxx0g 465m 24% 4987Mi 109%
aks-nodepool2-xxxxxxxxx-vmssxxxx8i 64m 3% 3085Mi 67%
aks-nodepool2-xxxxxxxxx-vmssxxxx8p 114m 6% 5320Mi 116%
aks-nodepool2-xxxxxxxxx-vmssxxxx9n 105m 5% 2715Mi 59%
aks-nodepool2-xxxxxxxxx-vmssxxxxaa 134m 7% 5216Mi 114%
aks-nodepool2-xxxxxxxxx-vmssxxxxat 179m 9% 5498Mi 120%
aks-nodepool2-xxxxxxxxx-vmssxxxxaz 141m 7% 4769Mi 104%
aks-nodepool2-xxxxxxxxx-vmssxxxxb0 72m 3% 1972Mi 43%
aks-nodepool2-xxxxxxxxx-vmssxxxxb1 133m 7% 3684Mi 80%
aks-nodepool2-xxxxxxxxx-vmssxxxxb3 182m 9% 5294Mi 115%
aks-nodepool2-xxxxxxxxx-vmssxxxxb4 133m 7% 5009Mi 109%
aks-nodepool2-xxxxxxxxx-vmssxxxxbj 68m 3% 1783Mi 39%
Taking the node aks-nodepool2-xxxxxxxxx-vmssxxxx8p (114m 6% 5320Mi 116%) as an example:
I added up the memory usage of every pod on that node, which comes to roughly 4.1 GB; the node's allocatable memory is 4.6 GB, and its physical memory is 7 GB.
Why does the kubectl top node output here differ from the sum of the kubectl top pod output for the pods on that node?
Expected % == 4.1 GB / 4.6 GB ≈ 89%, but the top node command reports 116%.
Answered by 小智 (score: 7)
This is expected behaviour from AKS, to keep the cluster secure and functioning correctly.
When you create a k8s cluster in AKS, that does not mean you get all of the memory/CPU the VMs have. Depending on the cluster configuration, it may consume even more than in your case; for example, if you enable the OMS agent (Azure Monitor for containers) for AKS, it also reserves some capacity.
From the official documentation, Kubernetes core concepts for Azure Kubernetes Service (AKS) --> Resource reservations. For the associated best practices, see Best practices for basic scheduler features in AKS.
AKS uses node resources to help the node function as part of your cluster. This usage can create a discrepancy between your node's total resources and the allocatable resources in AKS. Remember this information when setting requests and limits for user deployed pods.
To find a node's allocatable resources, run:
kubectl describe node [NODE_NAME]
To maintain node performance and functionality, AKS reserves resources on each node. As a node grows larger in resources, the resource reservation grows due to a higher need for management of user-deployed pods.
Two types of resources are reserved:
- CPU
Reserved CPU is dependent on node type and cluster configuration, which may cause less allocatable CPU due to running additional features.
- Memory
Memory utilized by AKS includes the sum of two values.
- kubelet daemon
The kubelet daemon is installed on all Kubernetes agent nodes to manage container creation and termination.
By default on AKS, kubelet daemon has the memory.available<750Mi eviction rule, ensuring a node must always have at least 750 Mi allocatable at all times. When a host is below that available memory threshold, the kubelet will trigger to terminate one of the running pods and free up memory on the host machine.
- A regressive rate of memory reservations for the kubelet daemon to properly function (kube-reserved).
25% of the first 4 GB of memory
20% of the next 4 GB of memory (up to 8 GB)
10% of the next 8 GB of memory (up to 16 GB)
6% of the next 112 GB of memory (up to 128 GB)
2% of any memory above 128 GB
Memory and CPU allocation rules:
- Keep agent nodes healthy, including some hosting system pods critical to cluster health.
- Cause the node to report less allocatable memory and CPU than it would if it were not part of a Kubernetes cluster.
The above resource reservations can't be changed.
For example, if a node offers 7 GB, it will report 34% of memory not allocatable including the 750Mi hard eviction threshold.
0.75 + (0.25*4) + (0.20*3) = 0.75GB + 1GB + 0.6GB = 2.35GB / 7GB = 33.57% reserved
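The worked example above can be sketched as a small Python helper that applies the regressive brackets quoted from the docs plus the 750Mi hard eviction threshold (a sketch of the documented formula at the time; newer AKS releases may use a different reservation model):

```python
def kube_reserved_memory_gb(node_memory_gb: float) -> float:
    """kube-reserved memory per the regressive brackets quoted above."""
    brackets = [  # (bracket size in GB, reservation rate)
        (4, 0.25),            # 25% of the first 4 GB
        (4, 0.20),            # 20% of the next 4 GB (up to 8 GB)
        (8, 0.10),            # 10% of the next 8 GB (up to 16 GB)
        (112, 0.06),          # 6% of the next 112 GB (up to 128 GB)
        (float("inf"), 0.02), # 2% of any memory above 128 GB
    ]
    reserved, remaining = 0.0, node_memory_gb
    for size_gb, rate in brackets:
        portion = min(remaining, size_gb)
        reserved += portion * rate
        remaining -= portion
        if remaining <= 0:
            break
    return reserved

EVICTION_GB = 0.75  # memory.available<750Mi hard eviction rule

node_gb = 7.0
total = EVICTION_GB + kube_reserved_memory_gb(node_gb)
print(f"reserved: {total:.2f} GB ({total / node_gb:.2%} of {node_gb:g} GB)")
# → reserved: 2.35 GB (33.57% of 7 GB)
```

This reproduces the 2.35 GB / 33.57% figure from the example, which matches the roughly 4.6 GB allocatable the questioner observed on a 7 GB node (before the OS's own reservations).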
In addition to reservations for Kubernetes itself, the underlying node OS also reserves an amount of CPU and memory resources to maintain OS functions.
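This also explains the over-100% readings in the question: `kubectl top node` reports the node's whole working-set memory (user pods plus kubelet, system pods, and the OS) as a percentage of *allocatable*, so it can exceed 100%, while summing `kubectl top pod` counts only the pods. A rough sketch with the figures from the question (4.6 GB allocatable, 5320 Mi node working set, ~4.1 GB pod total; small rounding differences from the reported 116% are expected):

```python
allocatable_mi = 4.6 * 1024    # allocatable memory reported by kubectl describe node
node_working_set_mi = 5320     # MEMORY(bytes) from kubectl top node
pod_usage_sum_mi = 4.1 * 1024  # sum of MEMORY(bytes) over kubectl top pod

# top node divides the whole-node working set by allocatable -> can exceed 100%
print(f"top node %  : {node_working_set_mi / allocatable_mi:.0%}")
# summing pods misses kubelet/system/OS usage -> the questioner's expected figure
print(f"pod sum %   : {pod_usage_sum_mi / allocatable_mi:.0%}")
print(f"non-pod use : {node_working_set_mi - pod_usage_sum_mi:.0f} Mi")
```

The gap of roughly 1.1 GB between the two numerators is the memory consumed outside user pods, which is exactly what the reservations above exist to protect.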
Viewed: 5228 times