Kubernetes autoscaling not working

Mik*_*ike 16 kubernetes

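I have a Kubernetes cluster that I am trying to build from scratch, without using their build scripts. Almost everything works except autoscaling. For some reason the controller manager cannot find heapster, or does not know that it is running.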

I opened an issue, but it has not received a reply:

https://github.com/kubernetes/kubernetes/issues/18652

Here is what I have set up.

This is the current list of all pods:

[root@kube-master test] [dev] # kubectl get pods --all-namespaces
NAMESPACE        NAME                                  READY     STATUS    RESTARTS   AGE
default          my-nginx-8kmlz                        1/1       Running   0          11h
default          my-nginx-z8cxb                        1/1       Running   0          11h
kube-system      heapster-v10-vdc1v                    3/3       Running   0          11h
kube-system      kube-apiserver-10.122.0.20            1/1       Running   0          4d
kube-system      kube-controller-manager-10.122.0.20   1/1       Running   1          9h
kube-system      kube-dns-6iw3a                        4/4       Running   0          4d
kube-system      kube-proxy-10.122.0.20                1/1       Running   0          3d
kube-system      kube-proxy-10.122.42.163              1/1       Running   0          4d
kube-system      kube-proxy-10.122.43.138              1/1       Running   1          4d
kube-system      kube-scheduler-10.122.0.20            1/1       Running   1          4d

So heapster is running, and I can reach it through the proxy at:

http://10.122.0.20:8080/api/v1/proxy/namespaces/kube-system/services/heapster/api/v1/model/namespaces/default/pods/my-nginx-8kmlz/stats

It returns stats for the pod.

I am really not sure what I am missing.

Here is the output for the autoscaler:

[root@kube-master test] [dev] # kubectl get hpa
NAME       REFERENCE                              TARGET    CURRENT     MINPODS   MAXPODS   AGE
my-nginx   ReplicationController/my-nginx/scale   80%       <waiting>   1         5         22h

The only relevant thing I see in my controller manager logs is:

W1224 18:27:43.425126       1 horizontal.go:185] Failed to reconcile my-nginx: failed to compute desired number of replicas based on CPU utilization for ReplicationController/default/my-nginx: failed to get cpu utilization: failed to get CPU consumption and request: some pods do not have request for cpu

小智 14

This error can occur if a CPU request is missing on your pods. You can confirm this by running the following command:

kubectl get rc $YOUR_RC -o yaml
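If there are many pods, reading the YAML by eye gets tedious. As a rough sketch (not part of the original answer), the same check can be scripted against the JSON form of the pod list, for example the output of `kubectl get pods -o json`. The function name and the sample data below are made up for illustration; only the `items[].spec.containers[].resources.requests.cpu` path comes from the pod spec layout.

```python
import json


def containers_missing_cpu_request(pod_list):
    """Return (pod name, container name) pairs that declare no CPU request."""
    missing = []
    for pod in pod_list.get("items", []):
        pod_name = pod["metadata"]["name"]
        for container in pod["spec"]["containers"]:
            # "resources" or "requests" may be absent or empty on a container.
            requests = (container.get("resources") or {}).get("requests") or {}
            if "cpu" not in requests:
                missing.append((pod_name, container["name"]))
    return missing


# Hypothetical sample shaped like `kubectl get pods -o json` output.
SAMPLE = json.loads("""
{
  "items": [
    {"metadata": {"name": "pod-a"},
     "spec": {"containers": [
       {"name": "nginx", "resources": {"requests": {"cpu": "400m"}}}]}},
    {"metadata": {"name": "pod-b"},
     "spec": {"containers": [
       {"name": "nginx", "resources": {}}]}}
  ]
}
""")

for pod_name, container_name in containers_missing_cpu_request(SAMPLE):
    print(f"{pod_name}/{container_name} has no CPU request")
```

Any pair it reports is a container that would trigger the "some pods do not have request for cpu" warning for an autoscaler targeting those pods.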

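To autoscale on CPU, you need to specify a CPU request in the resources section of the pod spec, because CPU autoscaling is based on a percentage of the requested CPU. For example: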

...
    spec:
      containers:
      - image: nginx
        name: nginx
        resources:
          requests:
            cpu: 400m
...
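To see why the request matters, here is a simplified sketch of the arithmetic behind a percentage target. It follows the documented rule `desired = ceil(current * currentUtilization / targetUtilization)` but ignores details such as the controller's tolerance band and unready pods, so treat it as an illustration rather than the controller's actual code. The function names and the usage figures are assumptions; the 80% target, the 1 to 5 replica bounds, and the `400m` request are taken from the question and the example above.

```python
import math


def parse_cpu_millicores(quantity):
    """Convert a CPU quantity such as '400m' or '0.5' to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(round(float(quantity) * 1000))


def desired_replicas(current_replicas, usage_millicores, request,
                     target_percent, min_pods, max_pods):
    """Estimate the replica count for a CPU utilization target.

    usage_millicores: measured CPU usage per pod, in millicores.
    request: the per-pod CPU request, e.g. '400m'.
    """
    request_m = parse_cpu_millicores(request)
    # Utilization is usage as a percentage of what was requested;
    # with no request there is nothing to divide by.
    utilization = 100.0 * sum(usage_millicores) / (request_m * len(usage_millicores))
    wanted = math.ceil(current_replicas * utilization / target_percent)
    # Clamp to the MINPODS / MAXPODS bounds of the autoscaler.
    return max(min_pods, min(max_pods, wanted))


# Two pods using 380m and 420m against a 400m request is 100% utilization.
print(desired_replicas(2, [380, 420], "400m", 80, 1, 5))
```

Without a request on every container the utilization percentage cannot be computed at all, which matches the `<waiting>` value in the CURRENT column.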


小智 -1

Sometimes this happens because resource metrics are not enabled.

You can verify this with the command below:

kubectl top pod -n <namespace>

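If this returns usage figures for your pods, metrics are enabled. If it returns an error instead, enable the resource metrics pipeline: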

https://kubernetes.io/docs/tasks/debug-application-cluster/resource-usage-monitoring/#resource-metrics-pipeline