k8s deployment cannot scale to more Pods on an EKS cluster

Eha*_*ban 1 kubernetes terraform amazon-eks

I'm still a newbie, so please be gentle with me!

I have an EKS cluster running with the following node group configuration:

resource "aws_eks_node_group" "this" {
  cluster_name    = aws_eks_cluster.this.name
  node_group_name = local.cluster_name
  node_role_arn   = aws_iam_role.eks_node.arn
  subnet_ids      = aws_subnet.this.*.id
  instance_types  = ["t2.micro"]

  scaling_config {
    desired_size = 2
    max_size     = 4
    min_size     = 2
  }

  # Optional: Allow external changes without Terraform plan difference
  lifecycle {
    ignore_changes = [scaling_config[0].desired_size]
  }

  depends_on = [
    aws_iam_role_policy_attachment.eks_AmazonEKSWorkerNodePolicy,
    aws_iam_role_policy_attachment.eks_AmazonEKS_CNI_Policy,
    aws_iam_role_policy_attachment.eks_AmazonEC2ContainerRegistryReadOnly,
  ]
}

My scaling configuration is:

  scaling_config {
    desired_size = 2
    max_size     = 4
    min_size     = 2
  }

I can successfully deploy 2 nginx replicas with the following configuration:

resource "kubernetes_deployment" "nginx" {
  metadata {
    name = "nginx"
    labels = {
      App = "Nginx"
    }
  }

  spec {
    replicas = 2
    selector {
      match_labels = {
        App = "Nginx"
      }
    }
    template {
      metadata {
        labels = {
          App = "Nginx"
        }
      }
      spec {
        container {
          image = "nginx:1.7.8"
          name  = "nginx"

          port {
            container_port = 80
          }

          resources {
            limits = {
              cpu    = "0.5"
              memory = "512Mi"
            }
            requests = {
              cpu    = "250m"
              memory = "50Mi"
            }
          }
        }
      }
    }
  }
}

But when I scale the replicas to 4, the Pods are created but stay in the Pending state, with the following reason:

Events:
  Type     Reason            Age                 From               Message
  ----     ------            ----                ----               -------
  Warning  FailedScheduling  18s (x2 over 108s)  default-scheduler  0/2 nodes are available: 2 Too many pods.

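I tried ignoring desired_size in scaling_config, but that did not help resolve the issue.

I believe I am missing an important piece of understanding about how scaling_config, the scaling group it creates, and the k8s deployment replicas relate to one another. Any guidance that helps me understand what is going on would be highly appreciated. Thanks a lot in advance.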

https://github.com/ehabshaaban/deploy-nginx/tree/eks

小智 7

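According to the message 0/2 nodes are available: 2 Too many pods., neither node has room for any more Pods. In EKS, the maximum number of Pods that can be placed on a node depends on several factors, mainly the instance type and the CNI. For the default values, you can refer to this document: eni-max-pod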
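
To make the limit concrete, here is a rough sketch of the arithmetic as Terraform locals. It assumes the default AWS VPC CNI without prefix delegation, where the per-node limit is ENIs × (IPv4 addresses per ENI − 1) + 2, and it assumes the usual system Pods of a default EKS cluster (aws-node and kube-proxy on every node, plus 2 CoreDNS replicas). All local names are hypothetical and not part of the question's repository.

```hcl
# Sketch only: estimates how many Pod slots are left for workloads.
locals {
  # Assumed t2.micro network limits: 2 ENIs, 2 IPv4 addresses per ENI.
  eni_count   = 2
  ips_per_eni = 2

  # Default VPC CNI formula (no prefix delegation): 2 * (2 - 1) + 2 = 4 Pods per node.
  max_pods_per_node = local.eni_count * (local.ips_per_eni - 1) + 2

  # Matches desired_size in the question.
  node_count = 2

  # Assumption: aws-node and kube-proxy run on every node, plus 2 CoreDNS replicas.
  system_pods = local.node_count * 2 + 2

  # Slots left over for workloads such as the nginx deployment.
  free_pod_slots = local.node_count * local.max_pods_per_node - local.system_pods
}

output "free_pod_slots" {
  value = local.free_pod_slots
}
```

Under these assumptions the two t2.micro nodes offer 8 Pod slots in total, 6 of which are taken by system Pods, which would explain why 2 nginx replicas schedule and the 3rd and 4th stay Pending. Running `kubectl get pods -A -o wide` should show what actually occupies each node.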

To resolve your issue, you can increase desired_size from 2 to 3, so that the Pods are placed on the new node.
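
As a sketch of that change, the node group from the question could look like the following. It reuses the resources defined in the question's own configuration. One thing to watch for: the `lifecycle` block in the question tells Terraform to ignore changes to `desired_size`, so on an existing node group a new value would not be applied while that block is present. It is removed here on that assumption; the alternative is to keep it and resize the node group outside Terraform.

```hcl
resource "aws_eks_node_group" "this" {
  cluster_name    = aws_eks_cluster.this.name
  node_group_name = local.cluster_name
  node_role_arn   = aws_iam_role.eks_node.arn
  subnet_ids      = aws_subnet.this.*.id
  instance_types  = ["t2.micro"]

  scaling_config {
    desired_size = 3 # was 2; still within min_size and max_size
    max_size     = 4
    min_size     = 2
  }

  # The lifecycle { ignore_changes = [scaling_config[0].desired_size] } block
  # from the question is omitted so that Terraform applies the new desired_size.

  depends_on = [
    aws_iam_role_policy_attachment.eks_AmazonEKSWorkerNodePolicy,
    aws_iam_role_policy_attachment.eks_AmazonEKS_CNI_Policy,
    aws_iam_role_policy_attachment.eks_AmazonEC2ContainerRegistryReadOnly,
  ]
}
```

Assuming a limit of 4 Pods per t2.micro node and the default system Pods (aws-node and kube-proxy on each node, plus 2 CoreDNS replicas), a third node would leave room for roughly 4 workload Pods. Note that `scaling_config` only sets the size of the node group; it does not add nodes in response to Pending Pods unless something like the Cluster Autoscaler is installed. A larger instance type would also raise the per-node Pod limit.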