How do I plot a histogram of a numeric variable?

nob*_*bar 4 splunk

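I would like to generate a simple histogram of a numeric variable X.

I am having trouble finding a clear example.

Since the meaning of a histogram matters more than its appearance, I would prefer to specify the bin size rather than let the tool decide. See: Data Scientists: Stop Randomly Binning Histograms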

nob*_*bar 7

Histograms are a primary tool for understanding the distribution of data. As such, Splunk automatically creates a histogram by default for raw event queries. So it stands to reason that Splunk should provide tools for you to create histograms of your own variables extracted from query results.

This may be hard to find because the basic answer is very simple:

(your query) |rename (your value) as X
|chart count by X span=1.0

Select "Visualization" and set chart type to "Column Chart" for a traditional vertical-bar histogram.

There is an example of this in the docs described as "Chart the number of transactions by duration".

The span value is used to control binning of the data. Adjust this value to optimize your visualization.
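To experiment with span without touching real data, a self-contained query along these lines should work. This is a sketch rather than part of the docs example: makeresults and random() generate 1000 synthetic events, and the modulus of 100 and the span of 10 are arbitrary choices for illustration.

```spl
| makeresults count=1000
| eval X=random() % 100
| chart count by X span=10
```

Because the synthetic values are roughly uniform over 0 to 99, the ten bins should come out at about the same height. Changing span to 5 or 20 shows how the bin width changes the picture.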

Warning: It is legal to omit span, but if you do so the X-axis will be compacted non-linearly to eliminate empty bins -- this could result in confusion if you aren't careful about observing the bin labels (assuming they're even drawn).


If you have a long-tail distribution, it may be useful to partition the results to focus on the range of interest. This can be done using where:

(your query) |rename (your value) as X
|where X>=0 and X<=100
|chart count by X span=1.0

Alternatively, use a clamping function to preserve the out-of-range counts:

(your query) |rename (your value) as X
|eval X=max(0,min(X,100))
|chart count by X span=1.0

Another way to deal with long tails is to use a logarithmic span: special values for span include log2 and log10 (documented as log-span).
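As a sketch of the logarithmic option, the query below swaps the numeric span for log2. The where clause is an assumption added for safety, not something taken from the documentation: a logarithmic scale is only defined for positive values, so non-positive X is filtered out first.

```spl
(your query) |rename (your value) as X
|where X>0
|chart count by X span=log2
```

With log2, each bin covers twice the range of the previous one, so a long tail collapses into a handful of bins. Keep in mind that the bins are no longer of equal width when reading the chart.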


If you would like to have both a non-default span and a compressed X-axis, there's probably a parameter for that -- but the documentation is cryptic.

I found that this 2-stage approach made that happen:

(your query) |rename (your value) as X
|bin X span=10.0 as X
|chart count by X

Again, this type of chart can be dangerously misleading if you don't pay careful attention to the labels.