AKS reports "Insufficient pods"

use*_*383 5 azure kubernetes

I've been working through the Azure Cats & Dogs tutorial described here, and I'm hitting an error in the final step, when the application is started in AKS. Kubernetes reports that I have insufficient pods, but I'm not sure why. I ran through this same tutorial a few weeks ago without any problems.

$ kubectl apply -f azure-vote-all-in-one-redis.yaml
deployment.apps/azure-vote-back created
service/azure-vote-back created
deployment.apps/azure-vote-front created
service/azure-vote-front created

$ kubectl get pods
NAME                                READY   STATUS    RESTARTS   AGE
azure-vote-back-655476c7f7-mntrt    0/1     Pending   0          6s
azure-vote-front-7c7d7f6778-mvflj   0/1     Pending   0          6s

$ kubectl get events
LAST SEEN   TYPE      REASON                 KIND         MESSAGE
3m36s       Warning   FailedScheduling       Pod          0/1 nodes are available: 1 Insufficient pods.
84s         Warning   FailedScheduling       Pod          0/1 nodes are available: 1 Insufficient pods.
70s         Warning   FailedScheduling       Pod          skip schedule deleting pod: default/azure-vote-back-655476c7f7-l5j28
9s          Warning   FailedScheduling       Pod          0/1 nodes are available: 1 Insufficient pods.
53m         Normal    SuccessfulCreate       ReplicaSet   Created pod: azure-vote-back-655476c7f7-kjld6
99s         Normal    SuccessfulCreate       ReplicaSet   Created pod: azure-vote-back-655476c7f7-l5j28
24s         Normal    SuccessfulCreate       ReplicaSet   Created pod: azure-vote-back-655476c7f7-mntrt
53m         Normal    ScalingReplicaSet      Deployment   Scaled up replica set azure-vote-back-655476c7f7 to 1
99s         Normal    ScalingReplicaSet      Deployment   Scaled up replica set azure-vote-back-655476c7f7 to 1
24s         Normal    ScalingReplicaSet      Deployment   Scaled up replica set azure-vote-back-655476c7f7 to 1
9s          Warning   FailedScheduling       Pod          0/1 nodes are available: 1 Insufficient pods.
3m36s       Warning   FailedScheduling       Pod          0/1 nodes are available: 1 Insufficient pods.
53m         Normal    SuccessfulCreate       ReplicaSet   Created pod: azure-vote-front-7c7d7f6778-rmbqb
24s         Normal    SuccessfulCreate       ReplicaSet   Created pod: azure-vote-front-7c7d7f6778-mvflj
53m         Normal    ScalingReplicaSet      Deployment   Scaled up replica set azure-vote-front-7c7d7f6778 to 1
53m         Normal    EnsuringLoadBalancer   Service      Ensuring load balancer
52m         Normal    EnsuredLoadBalancer    Service      Ensured load balancer
46s         Normal    DeletingLoadBalancer   Service      Deleting load balancer
24s         Normal    ScalingReplicaSet      Deployment   Scaled up replica set azure-vote-front-7c7d7f6778 to 1

$ kubectl get nodes
NAME                       STATUS   ROLES   AGE    VERSION
aks-nodepool1-27217108-0   Ready    agent   7d4h   v1.9.9

The only thing I can think of that has changed is that I'm now also running other (larger) clusters, and the main reason I went back through this Cats & Dogs tutorial is that I hit the same problem in my other clusters today. Is this a resource limit issue with my Azure account?

Update 10-20/3:15 PST: Note how these three clusters all show that they use the same node pool, even though they were created in different resource groups. Also note how the get-credentials call for gem2-cluster reports an error. I did previously have a cluster named gem2-cluster, which I deleted and recreated with the same name (in fact, I deleted the whole resource group). What's the correct way to do this? (A possible cleanup is sketched after the output below.)

$ az aks get-credentials --name gem1-cluster --resource-group gem1-rg
Merged "gem1-cluster" as current context in /home/psteele/.kube/config

$ kubectl get nodes -n gem1
NAME                       STATUS   ROLES   AGE     VERSION
aks-nodepool1-27217108-0   Ready    agent   3h26m   v1.9.11

$ az aks get-credentials --name gem2-cluster --resource-group gem2-rg
A different object named gem2-cluster already exists in clusters

$ az aks get-credentials --name gem3-cluster --resource-group gem3-rg
Merged "gem3-cluster" as current context in /home/psteele/.kube/config

$ kubectl get nodes -n gem1
NAME                       STATUS   ROLES   AGE   VERSION
aks-nodepool1-14202150-0   Ready    agent   26m   v1.9.11

$ kubectl get nodes -n gem2
NAME                       STATUS   ROLES   AGE   VERSION
aks-nodepool1-14202150-0   Ready    agent   26m   v1.9.11

$ kubectl get nodes -n gem3
NAME                       STATUS   ROLES   AGE   VERSION
aks-nodepool1-14202150-0   Ready    agent   26m   v1.9.11
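For reference, the gem2-cluster error above typically means the kubeconfig still contains entries from the deleted cluster of the same name. A minimal cleanup sketch, assuming the default AKS context/cluster/user naming:

# Remove the stale kubeconfig entries left over from the deleted cluster
$ kubectl config delete-context gem2-cluster
$ kubectl config delete-cluster gem2-cluster
$ kubectl config unset users.clusterUser_gem2-rg_gem2-cluster

# Or simply force the merge in one step
$ az aks get-credentials --name gem2-cluster --resource-group gem2-rg --overwrite-existing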

Lip*_*sum 13

What is your max-pods set to? This is a normal error when you've hit the pod limit per node.

You can check your current maximum pods per node with:

$ kubectl get nodes -o yaml | grep pods
  pods: "30"
  pods: "30"

And you currently have:

$ kubectl get pods --all-namespaces | grep Running | wc -l
  18
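If you need a higher limit, note that max-pods can (as of this writing) only be set when the cluster or node pool is created, not changed afterwards. A sketch using the Azure CLI, with hypothetical resource names:

# --max-pods is only honored at creation time (names here are illustrative)
$ az aks create --resource-group myResourceGroup --name myAKSCluster --node-count 1 --max-pods 50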

  • It's worth checking without the `grep Running` filter too, since that's what tripped me up: I had about 700 pods stuck in Pending because a cron job's image pull was failing. Thanks. (2 upvotes)
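Following up on that comment: to count pods in every state (so Pending pods aren't missed), something like this works:

# Count all pods regardless of phase; --no-headers drops the header row
$ kubectl get pods --all-namespaces --no-headers | wc -l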

Act*_*ack 7

I hit this because I had exceeded the maximum pods. I found out how many pods my cluster could handle with:

$ kubectl get nodes -o json | jq -r .items[].status.allocatable.pods | paste -sd+ - | bc

  • kubectl get nodes -o json and looking at items[].status.allocatable.pods worked. Mine was 4, and 4 pods were already running in the system namespace. (2 upvotes)
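Following up on that comment: for a per-node breakdown rather than the sum, a custom-columns query is an option (a sketch):

# Show each node's name next to its allocatable pod count
$ kubectl get nodes -o custom-columns=NAME:.metadata.name,PODS:.status.allocatable.pods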

Ken*_*SFT 0

Check to make sure you aren't hitting the core limit for your subscription:

az vm list-usage --location "<location>" -o table
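To zero in on the relevant rows, the table output can be filtered, e.g. (row names may vary by region and API version):

# Show only the vCPU-related quota rows
$ az vm list-usage --location "<location>" -o table | grep -i vcpu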

If you are, you can request more quota: https://learn.microsoft.com/en-us/azure/azure-supportability/resource-manager-core-quotas-request