在哪里可以获取节点导出器指标描述？

Question

在哪里可以获取节点导出器指标描述？

JAE*_*ARK 11 linux monitoring kubernetes prometheus

我刚开始使用 prometheus、node exporter 等监控 k8s 集群。

我想知道这些指标的确切含义是什么，尽管指标的名称是自我描述的。

我已经检查了节点导出器的github，但我没有得到有用的信息。

在哪里可以获得节点导出器指标的描述？

谢谢

Answer 1

ane*_*yte 11

每个指标都有一个简短的描述。如果您在浏览器中打开节点导出器或仅打开curl http://my-node-exporter:9100/metrics. 您将看到所有导出的指标和带有# HELP描述的行：

# HELP node_cpu_seconds_total Seconds the cpus spent in each mode.
# TYPE node_cpu_seconds_total counter
node_cpu_seconds_total{cpu="0",mode="idle"} 2.59840376e+07

Run Code Online (Sandbox Code Playgroud)

Grafana 可以在编辑器中显示此帮助消息： Prometheus（使用最近的实验编辑器）也可以显示它：这适用于所有指标，而不仅仅是节点导出器的指标。如果您需要有关这些值的更多技术细节，我建议您在 Google 和页面中搜索信息man（如果您使用的是 Linux）。节点导出器几乎按原样获取大部分指标/proc，并且不难找到详细信息。举个例子node_memory_KReclaimable_bytes。“字节”后缀显然是单位，node_memory只是一个命名空间前缀，并且KReclaimable是实际的指标名称。使用man -K KReclaimable将带您进入proc(5)手册页，您可以在其中找到：

           KReclaimable %lu (since Linux 4.20)
                 Kernel allocations that the kernel will attempt to
                 reclaim under memory pressure.  Includes
                 SReclaimable (below), and other direct allocations
                 with a shrinker.

Run Code Online (Sandbox Code Playgroud)

最后，如果想要了解有关指标的更多信息是出于为硬件配置警报的愿望，您可以跳到最后一部分并从此处获取社区共享的一些警报：https://awesome-prometheus-alerts .grep.to/rules#主机和硬件

归档时间：	4 年，2 月前
查看次数：	8493 次
最近记录：	3 年，4 月前