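I have a running EKS cluster. It uses an ALB for ingress.
When I apply the services and then the ingress, most of them work as expected. However, some target groups end up with no registered targets. If I get the service IP address with kubectl describe svc my-service-name and manually register the endpoints in the target group, the pods are reachable again, but this is not a sustainable process.
Any thoughts on what might be going on? Why does EKS fail to find the target groups as pods cycle?
Each service consists of a set of .yaml files (secrets, deployment, service, and ingress), applied as follows: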
deploy.sh
#!/bin/bash
set -e
kubectl apply -f ./secretsMap.yaml
kubectl apply -f ./configMap.yaml
kubectl apply -f ./deployment.yaml
kubectl apply -f ./service.yaml
kubectl apply -f ./ingress.yaml
service.yaml
apiVersion: v1
kind: Service
metadata:
  name: "site-bob"
  namespace: "next-sites"
spec:
  ports:
    - port: 80
      targetPort: 3000
      protocol: TCP
  type: NodePort
  selector:
    app: "site-bob"
ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: "site-bob"
  namespace: "next-sites"
  annotations:
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/tags: Environment=Production,Group=api
    alb.ingress.kubernetes.io/backend-protocol: HTTP
    alb.ingress.kubernetes.io/ip-address-type: ipv4
    alb.ingress.kubernetes.io/target-type: ip
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTP":80},{"HTTPS":443}]'
    alb.ingress.kubernetes.io/load-balancer-name: eks-ingress-1
    alb.ingress.kubernetes.io/group.name: eks-ingress-1
    alb.ingress.kubernetes.io/certificate-arn: arn:aws:acm:us-east-2:402995436123:certificate/9db9dce3-055d-4655-842e-xxxxx
    alb.ingress.kubernetes.io/healthcheck-port: traffic-port
    alb.ingress.kubernetes.io/healthcheck-path: /
    alb.ingress.kubernetes.io/healthcheck-interval-seconds: '30'
    alb.ingress.kubernetes.io/healthcheck-timeout-seconds: '16'
    alb.ingress.kubernetes.io/success-codes: 200,201
    alb.ingress.kubernetes.io/healthy-threshold-count: '2'
    alb.ingress.kubernetes.io/unhealthy-threshold-count: '2'
    alb.ingress.kubernetes.io/load-balancer-attributes: idle_timeout.timeout_seconds=60
    alb.ingress.kubernetes.io/target-group-attributes: deregistration_delay.timeout_seconds=30
    alb.ingress.kubernetes.io/actions.ssl-redirect: >
      {
        "type": "redirect",
        "redirectConfig": { "Protocol": "HTTPS", "Port": "443", "StatusCode": "HTTP_301"}
      }
    alb.ingress.kubernetes.io/actions.svc-host: >
      {
        "type":"forward",
        "forwardConfig":{
          "targetGroups":[
            {
              "serviceName":"site-bob",
              "servicePort": 80,"weight":20}
          ],
          "targetGroupStickinessConfig":{"enabled":true,"durationSeconds":200}
        }
      }
  labels:
    app: site-bob
spec:
  rules:
    - host: "staging-bob.imgeinc.net"
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: ssl-redirect
                port:
                  name: use-annotation
          - backend:
              service:
                name: svc-host
                port:
                  name: use-annotation
            pathType: ImplementationSpecific
Something had been added to my configuration that tagged two security groups as owned by the cluster. When I checked the load balancer controller logs:
kubectl logs -n kube-system aws-load-balancer-controller-677c7998bb-l7mwb
I saw many lines like:
{"level":"error","ts":1641996465.6707578,"logger":"controller-runtime.manager.controller.targetGroupBinding","msg":"Reconciler error","reconciler group":"elbv2.k8s.aws","reconciler kind":"TargetGroupBinding","name":"k8s-nextsite-sitefest-89a6f0ff0a","namespace":"next-sites","error":"expect exactly one securityGroup tagged with kubernetes.io/cluster/imageinc-next-eks-4KN4v6EX for eni eni-0c5555fb9a87e93ad, got: [sg-04b2754f1c85ac8b9 sg-07b026b037dd4d6a4]"}
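sg-07b026b037dd4d6a4 has the description: EKS created security group applied to ENI that is attached to EKS Control Plane master nodes, as well as any managed workloads.
sg-04b2754f1c85ac8b9 has the description: Security group for all nodes in the cluster.
I removed the tag: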
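To pull the conflicting security group IDs straight out of the controller logs rather than reading them by eye, something like the following could work. It is a minimal sketch: the function name conflicting_sgs is made up for illustration, and it assumes the error text has the same shape as the log line quoted above.

```shell
#!/bin/bash
# Hypothetical helper: reads controller log lines on stdin and prints the
# security group IDs listed in "expect exactly one securityGroup" errors,
# one ID per line, de-duplicated.
conflicting_sgs() {
  grep 'expect exactly one securityGroup' \
    | grep -o 'got: \[[^]]*\]' \
    | grep -o 'sg-[0-9a-f]*' \
    | sort -u
}
```

It can be fed from the same command used above, for example: kubectl logs -n kube-system aws-load-balancer-controller-677c7998bb-l7mwb | conflicting_sgs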
{
Key: 'kubernetes.io/cluster/_cluster name_',
value:'owned'
}
from sg-04b2754f1c85ac8b9
The target groups started populating, and everything works now. Both groups were created and tagged by Terraform. I suspect my worker group configuration is off.
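For anyone hitting the same error, the check and the fix can also be done from the AWS CLI. This is a rough sketch, not the exact steps I ran: the function names and the my-cluster default are placeholders, and it assumes the aws CLI is configured for the account and region of the cluster.

```shell
#!/bin/bash
# CLUSTER_NAME is a placeholder; set it to your own cluster name.
CLUSTER_NAME="${CLUSTER_NAME:-my-cluster}"
TAG_KEY="kubernetes.io/cluster/${CLUSTER_NAME}"

# Hypothetical helper: list every security group carrying the cluster tag.
# The controller expects exactly one of these per ENI.
list_tagged_sgs() {
  aws ec2 describe-security-groups \
    --filters "Name=tag-key,Values=${TAG_KEY}" \
    --query 'SecurityGroups[].[GroupId,Description]' \
    --output text
}

# Hypothetical helper: remove the cluster tag from one security group.
# Omitting the tag value deletes the tag whatever its value is.
untag_sg() {
  aws ec2 delete-tags --resources "$1" --tags "Key=${TAG_KEY}"
}
```

Running list_tagged_sgs first shows which groups carry the tag, and untag_sg sg-04b2754f1c85ac8b9 would then drop it from the node group. If Terraform manages the tag, it will probably re-add it on the next apply unless the Terraform configuration is changed too.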