AWS AutoScaling down policy fails after a successful scale-up: Failed to execute AutoScaling action: No step adjustment found for metric value

nar*_*ler 5 javascript amazon-web-services autoscaling amazon-cloudwatch terraform

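When I create the autoscaling policies and CloudWatch alarm resources in Terraform v13, they are created without problems. However, when load testing the endpoint, the policies successfully scale the instances up, but they fail to scale them back down once CPU utilization has stayed at the required percentage for the set period of time. The error looks like this: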

"historySummary": "Failed to execute AutoScaling action: No step adjustment found for metric value [5.99763732496649, 2.7634547331059975] and breach delta -4.00236267503351"

The Terraform resources are listed below:

Autoscaling policy -

resource "aws_appautoscaling_policy" "frontend_down" {
  name               = "${var.name}_frontend_scale_down"
  service_namespace  = "ecs"
  resource_id        = "service/${aws_ecs_cluster.main.name}/${aws_ecs_service.frontend.name}"
  scalable_dimension = "ecs:service:DesiredCount"

  step_scaling_policy_configuration {
    adjustment_type         = "ChangeInCapacity"
    cooldown                = 30
    metric_aggregation_type = "Maximum"

    step_adjustment {
      metric_interval_lower_bound = 0
      scaling_adjustment          = -1
    }
  }

  depends_on = [aws_appautoscaling_target.frontend_target]
}

CloudWatch alarm -

resource "aws_cloudwatch_metric_alarm" "frontend_service_cpu_low" {
  alarm_name          = "${var.name}_cpu_utilization_low_fe"
  comparison_operator = "LessThanOrEqualToThreshold"
  evaluation_periods  = "2"
  metric_name         = "CPUUtilization"
  namespace           = "AWS/ECS"
  period              = "60"
  statistic           = "Average"
  threshold           = "10"

  dimensions = {
    ClusterName = var.ecs_cluster_name
    ServiceName = var.ecs_service_name_frontend
  }

  alarm_actions = [var.autoscaling_down_arn_frontend]

  tags = {
    Name        = "${var.name}-autoscaling"
    BillingCode = var.billing_code_tag
    Environment = var.environment_tag
  }
}

nar*_*ler 8

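Figured out the cause: in the scale-down policy I was using "metric_interval_lower_bound" instead of "metric_interval_upper_bound". When scaling down, the CloudWatch metric is below the alarm threshold, so the breach delta (metric value minus threshold) is negative, and 0 has to be the upper bound of the step interval. With a lower bound of 0 and no upper bound, the step only covers deltas of 0 and above, so a delta such as -4.00236267503351 matches no step adjustment. When scaling up you can use the lower bound, because the delta is positive.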
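
As a sketch of the fix, assuming everything else in the question's policy stays unchanged, the step adjustment in the scale-down policy would look roughly like this. Only the bound attribute differs from the original; the comments are mine, and the arithmetic in them uses the threshold of 10 from the alarm and the first metric value from the error message.

```hcl
resource "aws_appautoscaling_policy" "frontend_down" {
  name               = "${var.name}_frontend_scale_down"
  service_namespace  = "ecs"
  resource_id        = "service/${aws_ecs_cluster.main.name}/${aws_ecs_service.frontend.name}"
  scalable_dimension = "ecs:service:DesiredCount"

  step_scaling_policy_configuration {
    adjustment_type         = "ChangeInCapacity"
    cooldown                = 30
    metric_aggregation_type = "Maximum"

    step_adjustment {
      # Breach delta = metric value - alarm threshold.
      # Example from the error: 5.99763732496649 - 10 = -4.00236267503351.
      # With no lower bound and an upper bound of 0, this step covers
      # every negative delta, so the scale-down alarm finds a match.
      metric_interval_upper_bound = 0
      scaling_adjustment          = -1
    }
  }

  depends_on = [aws_appautoscaling_target.frontend_target]
}
```

The scale-up policy is the mirror image: it keeps metric_interval_lower_bound = 0 with a positive scaling_adjustment, since its alarm fires when the metric is above the threshold.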