kar*_*avi 2 elasticsearch kubernetes efk azure-aks
我正在尝试在Kubernetes上设置EFK堆栈。使用的Elasticsearch版本为6.3.2。一切正常,直到我将探针配置放入部署YAML文件中。我收到如下错误。这导致吊舱被声明为不正常运行,并最终被重新启动,这似乎是错误的重新启动。
警告不健康的15s kubelet,aks-agentpool-23337112-0活动探针失败:获取http://10.XXX.Y.ZZZ:9200 / _cluster / health:拨打tcp 10.XXX.Y.ZZZ:9200:connect:connection被拒绝
我确实尝试过使用telnet从另一个容器到具有IP和端口的Elasticsearch Pod,但我成功了,但只有节点上的kubelet无法解析Pod的IP,导致探测失败。
以下是Kubernetes Statefulset YAML的pod规范的摘录。对决议的任何帮助将非常有帮助。花了很多时间对此一无所知:(
PS:在AKS群集上正在设置堆栈
- name: es-data
image: quay.io/pires/docker-elasticsearch-kubernetes:6.3.2
env:
- name: NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
- name: NODE_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: CLUSTER_NAME
value: myesdb
- name: NODE_MASTER
value: "false"
- name: NODE_INGEST
value: "false"
- name: HTTP_ENABLE
value: "true"
- name: NODE_DATA
value: "true"
- name: DISCOVERY_SERVICE
value: "elasticsearch-discovery"
- name: NETWORK_HOST
value: "_eth0:ipv4_"
- name: ES_JAVA_OPTS
value: -Xms512m -Xmx512m
- name: PROCESSORS
valueFrom:
resourceFieldRef:
resource: limits.cpu
resources:
requests:
cpu: 0.25
limits:
cpu: 1
ports:
- containerPort: 9200
name: http
- containerPort: 9300
name: transport
livenessProbe:
httpGet:
port: http
path: /_cluster/health
initialDelaySeconds: 40
periodSeconds: 10
readinessProbe:
httpGet:
path: /_cluster/health
port: http
initialDelaySeconds: 30
timeoutSeconds: 10
Run Code Online (Sandbox Code Playgroud)
如果没有放置探针,则豆荚/容器运行良好。可以预期的是,在部署YAML上进行设置时,探针应能正常工作,并且POD不应重新启动。
问题是 ElasticSearch 本身有自己的健康状态(红色、黄色、绿色),您需要在配置中考虑这一点。
这是我在自己的 ES 配置中发现的,基于官方 ES helm 图表:
readinessProbe:
failureThreshold: 3
initialDelaySeconds: 40
periodSeconds: 10
successThreshold: 3
timeoutSeconds: 5
exec:
command:
- sh
- -c
- |
#!/usr/bin/env bash -e
# If the node is starting up wait for the cluster to be green
# Once it has started only check that the node itself is responding
START_FILE=/tmp/.es_start_file
http () {
local path="${1}"
if [ -n "${ELASTIC_USERNAME}" ] && [ -n "${ELASTIC_PASSWORD}" ]; then
BASIC_AUTH="-u ${ELASTIC_USERNAME}:${ELASTIC_PASSWORD}"
else
BASIC_AUTH=''
fi
curl -XGET -s -k --fail ${BASIC_AUTH} http://127.0.0.1:9200${path}
}
if [ -f "${START_FILE}" ]; then
echo 'Elasticsearch is already running, lets check the node is healthy'
http "/"
else
echo 'Waiting for elasticsearch cluster to become green'
if http "/_cluster/health?wait_for_status=green&timeout=1s" ; then
touch ${START_FILE}
exit 0
else
echo 'Cluster is not yet green'
exit 1
fi
fi
Run Code Online (Sandbox Code Playgroud)
首先,请使用以下命令检查日志
kubectl logs <pod name> -n <namespacename>
Run Code Online (Sandbox Code Playgroud)
您必须首先运行init容器并更改卷权限。
您必须同时运行整个配置,user : 1000
然后再启动Elasticsearch容器之前,必须使用init容器更改卷许可权。
apiVersion: apps/v1
kind: StatefulSet
metadata:
labels:
app : elasticsearch
component: elasticsearch
release: elasticsearch
name: elasticsearch
spec:
podManagementPolicy: Parallel
replicas: 1
revisionHistoryLimit: 10
selector:
matchLabels:
app : elasticsearch
component: elasticsearch
release: elasticsearch
serviceName: elasticsearch
template:
metadata:
creationTimestamp: null
labels:
app : elasticsearch
component: elasticsearch
release: elasticsearch
spec:
containers:
- env:
- name: cluster.name
value: <SET THIS>
- name: discovery.type
value: single-node
- name: ES_JAVA_OPTS
value: -Xms512m -Xmx512m
- name: bootstrap.memory_lock
value: "false"
image: elasticsearch:6.5.0
imagePullPolicy: IfNotPresent
name: elasticsearch
ports:
- containerPort: 9200
name: http
protocol: TCP
- containerPort: 9300
name: transport
protocol: TCP
resources:
limits:
cpu: 250m
memory: 1Gi
requests:
cpu: 150m
memory: 512Mi
securityContext:
privileged: true
runAsUser: 1000
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /usr/share/elasticsearch/data
name: elasticsearch-data
dnsPolicy: ClusterFirst
initContainers:
- command:
- sh
- -c
- chown -R 1000:1000 /usr/share/elasticsearch/data
- sysctl -w vm.max_map_count=262144
- chmod 777 /usr/share/elasticsearch/data
- chomod 777 /usr/share/elasticsearch/data/node
- chmod g+rwx /usr/share/elasticsearch/data
- chgrp 1000 /usr/share/elasticsearch/data
image: busybox:1.29.2
imagePullPolicy: IfNotPresent
name: set-dir-owner
resources: {}
securityContext:
privileged: true
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /usr/share/elasticsearch/data
name: elasticsearch-data
restartPolicy: Always
schedulerName: default-scheduler
securityContext: {}
terminationGracePeriodSeconds: 10
updateStrategy:
type: OnDelete
volumeClaimTemplates:
- metadata:
creationTimestamp: null
name: elasticsearch-data
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 10Gi
Run Code Online (Sandbox Code Playgroud)
查看我的yaml配置,即可使用。适用于Elasticsearch的单个节点
归档时间: |
|
查看次数: |
271 次 |
最近记录: |