bal*_*ach 3 logging grafana kubernetes grafana-loki promtail
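Promtail, Grafana, and Loki are all at version 2.4.1, running on Kubernetes.
I am following the documentation.
I expect an error stack trace to appear as a single entry in Grafana/Loki, but each line shows up as a separate entry. Am I missing some configuration?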
# cat /etc/promtail/promtail.yaml
server:
  log_level: info
  http_listen_port: 3101

client:
  url: http://***-loki:3100/loki/api/v1/push

positions:
  filename: /run/promtail/positions.yaml

scrape_configs:
  # See also https://github.com/grafana/loki/blob/master/production/ksonnet/promtail/scrape_config.libsonnet for reference
  - job_name: kubernetes-pods
    pipeline_stages:
      - multiline:
          firstline: ^\x{200B}\[
          max_lines: 128
          max_wait_time: 3s
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      - source_labels:
          - __meta_kubernetes_pod_controller_name
        regex: ([0-9a-z-.]+?)(-[0-9a-f]{8,10})?
        action: replace
        target_label: __tmp_controller_name
      - source_labels:
          - __meta_kubernetes_pod_label_app_kubernetes_io_name
          - __meta_kubernetes_pod_label_app
          - __tmp_controller_name
          - __meta_kubernetes_pod_name
        regex: ^;*([^;]+)(;.*)?$
        action: replace
        target_label: app
      - source_labels:
          - __meta_kubernetes_pod_label_app_kubernetes_io_component
          - __meta_kubernetes_pod_label_component
        regex: ^;*([^;]+)(;.*)?$
        action: replace
        target_label: component
      - action: replace
        source_labels:
          - __meta_kubernetes_pod_node_name
        target_label: node_name
      - action: replace
        source_labels:
          - __meta_kubernetes_namespace
        target_label: namespace
      - action: replace
        replacement: $1
        separator: /
        source_labels:
          - namespace
          - app
        target_label: job
      - action: replace
        source_labels:
          - __meta_kubernetes_pod_name
        target_label: pod
      - action: replace
        source_labels:
          - __meta_kubernetes_pod_container_name
        target_label: container
      - action: replace
        replacement: /var/log/pods/*$1/*.log
        separator: /
        source_labels:
          - __meta_kubernetes_pod_uid
          - __meta_kubernetes_pod_container_name
        target_label: __path__
      - action: replace
        regex: true/(.*)
        replacement: /var/log/pods/*$1/*.log
        separator: /
        source_labels:
          - __meta_kubernetes_pod_annotationpresent_kubernetes_io_config_hash
          - __meta_kubernetes_pod_annotation_kubernetes_io_config_hash
          - __meta_kubernetes_pod_container_name
        target_label: __path__
It turned out that the raw logs look different from what we see in the Lens pod logs view or in the output of kubectl logs {pod}.
The raw logs that Promtail consumes can be found on the host machine:

minikube ssh
cat /var/log/pods/{namespace}_{pod}/{container}/0.log

They look like this:
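
{"log":"\xe2\x80\x8b[default-nioEventLoopGroup-1-1] INFO HTTP_ACCESS_LOGGER - \"GET /health/readiness HTTP/1.1\" 200 523\n","stream":"stdout","time":"2021-12-17T12:26:29.702621198Z"}

Here \xe2\x80\x8b is the UTF-8 byte sequence of the zero-width space U+200B. Because every raw line starts with the JSON wrapper rather than with the message itself, the firstline regex did not match any log line. Unfortunately, Promtail logs no error about this.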
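The mismatch can be reproduced outside Promtail. The Python sketch below is an illustration only, not Promtail code. It rebuilds a raw line from the sample above with json.dumps, and it writes the firstline pattern with \u200b because Python's re module does not accept the RE2-style \x{200B} escape used in the Promtail config. The helper name matches_firstline is hypothetical.

```python
import json
import re

# Python equivalent of the Promtail pattern ^\x{200B}\[
# (assumption: \u200b stands in for the RE2 escape \x{200B}).
FIRSTLINE = re.compile(r"^\u200b\[")

# The application's message, starting with a zero-width space (U+200B).
message = (
    "\u200b[default-nioEventLoopGroup-1-1] INFO HTTP_ACCESS_LOGGER - "
    '"GET /health/readiness HTTP/1.1" 200 523\n'
)

# What ends up in /var/log/pods/.../0.log: the message wrapped in JSON.
raw_line = json.dumps(
    {"log": message, "stream": "stdout", "time": "2021-12-17T12:26:29.702621198Z"}
)


def matches_firstline(line: str) -> bool:
    """Return True if the line would start a new multiline block."""
    return FIRSTLINE.search(line) is not None


# The raw line starts with '{', so the pattern cannot match it.
raw_matches = matches_firstline(raw_line)

# After extracting the "log" field, the pattern matches as intended.
parsed_matches = matches_firstline(json.loads(raw_line)["log"])

print(raw_matches, parsed_matches)
```

The pattern only has a chance to match once the JSON wrapper has been removed, which is why the parsing stage has to run before the multiline stage.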

This is the Docker log format, and there is a pipeline stage that parses it:

- docker: {}

In addition, there was a problem in the logs themselves: the multiline stack traces contained extra line breaks, so this additional pipeline stage strips them:

- replace:
    expression: '(\n)'
    replace: ''

So my working config looks like this:

server:
  log_level: info
  http_listen_port: 3101

client:
  url: http://***-loki:3100/loki/api/v1/push

positions:
  filename: /run/promtail/positions.yaml

scrape_configs:
  # See also https://github.com/grafana/loki/blob/master/production/ksonnet/promtail/scrape_config.libsonnet for reference
  - job_name: kubernetes-pods
    pipeline_stages:
      - docker: {}
      - multiline:
          firstline: ^\x{200B}\[
          max_lines: 128
          max_wait_time: 3s
      - replace:
          expression: (\n)
          replace: ""

# config continues below (not copied)
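To see why the stage order matters, here is a rough Python simulation of the three stages. It is a sketch under several assumptions and not Promtail's implementation: the function names are hypothetical, the sample messages are made up, grouped lines are assumed to be joined with a newline, and the time-based flush (max_wait_time) is not modeled.

```python
import json
import re

# Python spelling of the firstline pattern ^\x{200B}\[ (assumption: \u200b
# replaces the RE2 escape, which Python's re module does not support).
FIRSTLINE = re.compile(r"^\u200b\[")


def docker_stage(raw_lines):
    """Unwrap the Docker JSON format and yield only the message text."""
    for raw in raw_lines:
        yield json.loads(raw)["log"]


def multiline_stage(messages, max_lines=128):
    """Group messages into blocks; a firstline match starts a new block."""
    block = []
    for msg in messages:
        if block and (FIRSTLINE.search(msg) or len(block) >= max_lines):
            yield "\n".join(block)  # assumed join character
            block = []
        block.append(msg)
    if block:
        yield "\n".join(block)


def replace_stage(entry):
    """Remove every line break, like the replace stage with expression (\\n)."""
    return re.sub(r"(\n)", "", entry)


def wrap(msg):
    """Build a hypothetical raw Docker log line for a message."""
    return json.dumps(
        {"log": msg, "stream": "stderr", "time": "2021-12-17T12:26:29.702621198Z"}
    )


# Made-up sample: one stack trace spread over three lines, then a new entry.
raw_lines = [
    wrap("\u200b[main] ERROR Demo - request failed\n"),
    wrap("java.lang.IllegalStateException: boom\n"),
    wrap("\tat demo.Main.run(Main.java:10)\n"),
    wrap("\u200b[main] INFO Demo - next entry\n"),
]

entries = [replace_stage(e) for e in multiline_stage(docker_stage(raw_lines))]
print(len(entries))
```

In this simulation the three stack trace lines end up in the first entry and the following log message becomes its own entry. Swapping docker_stage and multiline_stage would leave nothing for the firstline pattern to match, which corresponds to the behavior described in the question.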