Sam*_*ami 10 java statistics histogram bins apache-commons-math
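I have been looking for a way to produce bins for a given data set with Apache Commons Math 3.0, by specifying the lower bound, the upper bound, and the number of bins required. I have looked at Frequency http://commons.apache.org/math/apidocs/org/apache/commons/math3/stat/Frequency.html but it does not give me what I want. I want a method that gives me the frequency of values within an interval (for example: how many values are between 0 and 5). Any suggestions or ideas?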
Alt*_*852 15
Here is a simple way to build a histogram with Apache Commons Math 3:
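import org.apache.commons.math3.random.EmpiricalDistribution;
import org.apache.commons.math3.stat.descriptive.SummaryStatistics;

final int BIN_COUNT = 20;
double[] data = {1.2, 0.2, 0.333, 1.4, 1.5, 1.2, 1.3, 10.4, 1, 2.0};
long[] histogram = new long[BIN_COUNT];

// EmpiricalDistribution splits the range from the smallest to the largest
// data value into BIN_COUNT equal-width bins.
EmpiricalDistribution distribution = new EmpiricalDistribution(BIN_COUNT);
distribution.load(data);

// getBinStats() returns one SummaryStatistics per bin; getN() is the number
// of data values that fell into that bin.
int k = 0;
for (SummaryStatistics stats : distribution.getBinStats()) {
    histogram[k++] = stats.getN();
}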
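The counts alone do not say which interval each bin covers. Since the bins here are equal-width and span the data's minimum to maximum, the edges can be worked out by hand. Below is a minimal plain-Java sketch of that arithmetic; the class `BinEdges` and the method `binEdges` are hypothetical names made up for illustration, not part of Commons Math, and the library's own handling of values that sit exactly on an edge may differ.

```java
import java.util.Arrays;

final class BinEdges {
    /** Returns numBins + 1 edges; bin i covers edges[i] to edges[i + 1]. */
    static double[] binEdges(double min, double max, int numBins) {
        if (numBins <= 0 || !(max > min)) {
            throw new IllegalArgumentException("need numBins > 0 and max > min");
        }
        double[] edges = new double[numBins + 1];
        double width = (max - min) / numBins;
        for (int i = 0; i < numBins; i++) {
            edges[i] = min + i * width;
        }
        edges[numBins] = max; // set the last edge exactly to avoid rounding drift
        return edges;
    }

    public static void main(String[] args) {
        double[] data = {1.2, 0.2, 0.333, 1.4, 1.5, 1.2, 1.3, 10.4, 1, 2.0};
        // Same range the histogram above is built over: data minimum to data maximum.
        double min = Arrays.stream(data).min().getAsDouble();
        double max = Arrays.stream(data).max().getAsDouble();
        double[] edges = binEdges(min, max, 20);
        for (int i = 0; i < 20; i++) {
            System.out.printf("bin %d: %.3f to %.3f%n", i, edges[i], edges[i + 1]);
        }
    }
}
```

Passing your own lower and upper bound instead of the data's min and max gives the edges for a fixed range, which is closer to what the question asks for.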
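As far as I know, there is no good histogram class in Apache Commons. I ended up writing my own. If all you want is equal-width bins running from a minimum to a maximum, it is quite easy to write.
Maybe something like this: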
public static int[] calcHistogram(double[] data, double min, double max, int numBins) {
    final int[] result = new int[numBins];
    final double binSize = (max - min) / numBins;
    for (double d : data) {
        // Math.floor rather than a bare (int) cast: the cast truncates toward zero,
        // so values just below min would otherwise be counted in bin 0.
        int bin = (int) Math.floor((d - min) / binSize);
        if (bin < 0) { /* this data point is smaller than min */ }
        else if (bin >= numBins) { /* this data point is equal to or bigger than max */ }
        else {
            result[bin] += 1;
        }
    }
    return result;
}
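The two empty branches in that function are where out-of-range values get dropped, and a value exactly equal to max is dropped too. If you would rather keep track of those, one possible variant is sketched below. It is a different design choice, not the method above: it assumes max should count as part of the last bin, and `RangeHistogram` with its fields `counts`, `below` and `above` are hypothetical names.

```java
final class RangeHistogram {
    final int[] counts; // counts[i] = number of values in bin i
    int below;          // values smaller than min
    int above;          // values larger than max

    RangeHistogram(double[] data, double min, double max, int numBins) {
        counts = new int[numBins];
        final double binSize = (max - min) / numBins;
        for (double d : data) {
            if (Double.isNaN(d)) {
                continue; // NaN belongs to no bin
            }
            if (d < min) {
                below++;
            } else if (d > max) {
                above++;
            } else {
                int bin = (int) ((d - min) / binSize);
                // d == max (or rounding) can give numBins; clamp into the last bin.
                counts[Math.min(bin, numBins - 1)]++;
            }
        }
    }

    public static void main(String[] args) {
        double[] data = { -1, 0, 2, 4, 6, 7, 8, 9, 10, 11 };
        RangeHistogram h = new RangeHistogram(data, 0, 10, 4);
        System.out.println(java.util.Arrays.toString(h.counts)
                + " below=" + h.below + " above=" + h.above);
    }
}
```

With this variant the bin counts plus `below` and `above` add up to the number of non-NaN inputs, which makes it easy to check that nothing was lost silently.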
Edit: here is an example.
double[] data = { 2, 4, 6, 7, 8, 9 };
int[] histogram = calcHistogram(data, 0, 10, 4);
// This is a histogram with 4 bins, 0-2.5, 2.5-5, 5-7.5, 7.5-10.
assert histogram[0] == 1; // one point (2) in range 0-2.5
assert histogram[1] == 1; // one point (4) in range 2.5-5.
// etc..
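The question's own example (how many values lie between 0 and 5) corresponds to adding up the first two bins in the example above. If only a single interval matters, a direct count may be simpler than building a histogram at all. A minimal sketch, assuming a half-open interval (lower bound included, upper bound excluded); `RangeCount` and `countInRange` are hypothetical names:

```java
import java.util.Arrays;

final class RangeCount {
    /** Counts the values v with lo <= v < hi. */
    static long countInRange(double[] data, double lo, double hi) {
        return Arrays.stream(data).filter(v -> v >= lo && v < hi).count();
    }

    public static void main(String[] args) {
        double[] data = { 2, 4, 6, 7, 8, 9 };
        System.out.println(countInRange(data, 0, 5)); // prints 2 (the values 2 and 4)
    }
}
```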