Prometheus - How to filter out stale metrics from a range query?

Question

dal*_*vik 6 monitoring prometheus promql

On my Prometheus instance, I have set storage.tsdb.retention.size to 128GiB and storage.tsdb.retention.time to 0s, so Prometheus keeps old data until the 128 GiB limit is reached.

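Now, I have some time series that have not been updated for a long time (i.e. they are stale). If I run a range query over a recent window in which the stale metric no longer exists, everything is fine.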

For example, the PromQL query:

> metric{label1="foo"}[1d]

returns:

...
metric{label1="foo"}  <value>@<timestamp>  # <== OK, fresh time series

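However, if I run a range query that reaches further back, to a time when the now-stale metric was still being updated, that stale time series is included in the result.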

For example, the query:

> metric{label1="foo"}[60d]

returns:

...
metric{label1="foo"}               <value>@<timestamp>  # <== OK, newest timestamp right now
...
metric{label1="foo",label2="bar"}  <value>@<timestamp>  # <== !! newest timestamp one month ago!

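I do not want the second (stale) time series in the result; I only want the data of the first (fresh) time series, going back 60 days.

Is there a way to achieve this with PromQL, i.e. to filter stale time series out of a range query?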

Answer 1

mar*_*lex 0

You can filter out time series that do not currently exist by appending something like

and mymetric@end()

to the end of the query (taking into account any labels that the query drops).

For example, you could do this:

max_over_time(ALERTS[60m]) and ALERTS@end()

Or this:

count by(instance) (max_over_time(ALERTS[60m])) and on(instance) ALERTS@end()
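
Applied to the metric from the question, the same idea might look like the sketch below. It rests on a few assumptions that are not in the original answer: the and operator only works on instant vectors, so the raw range selector metric{label1="foo"}[60d] cannot be filtered directly. This version wraps the filtered instant vector in a subquery instead, which re-evaluates the selector at every step rather than returning the raw stored samples. The 1m resolution is an arbitrary illustrative value, and the Prometheus version in use must support the @ modifier.

# Sketch only: keeps series that still exist at the end of the query window.
# The 1m subquery resolution is an assumed value, not taken from the question.
(metric{label1="foo"} and metric{label1="foo"} @ end())[60d:1m]

With this form, a series such as metric{label1="foo",label2="bar"} that has no sample at the query end time has nothing to match on the right-hand side of and, so it should drop out of the result.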

Archived:	5 years, 2 months ago
Views:	1163
Last recorded:	2 years, 11 months ago