Most efficient way to compute an integral over a range of a data set

Question

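I have an array of 10 rows by 20 columns. Each column corresponds to a data set that cannot be fitted with any kind of continuous mathematical function (it is a series of numbers obtained experimentally). I would like to calculate the integral of each column between row 4 and row 8, and then store the results in a new array (20 rows x 1 column).

I have tried different scipy.integrate functions (e.g. quad, trapz, etc.).

The problem is that, as far as I understand, scipy.integrate must be applied to functions, and I am not sure how to convert each column of my initial array into a function. As an alternative, I thought of calculating the mean of each column between row 4 and row 8, multiplying that number by 4 (i.e. 8-4=4, the x interval), and storing the result in my final 20x1 array. The problem is... well... I don't know how to calculate the mean over a given range. The questions I want to ask are:

Which approach is more efficient/straightforward?
Can integrals be calculated over a data set like the one I have described?
How do I calculate the mean over a range of rows?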

Answer 1

pv.*_*pv. 5

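Since you only know the data points, your best choice is trapz (the trapezoidal approximation of the integral, based on the data points you know).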
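
To make concrete what the trapezoidal approximation computes, here is a minimal sketch that writes the rule out by hand and compares it with the library call. The sample values are made up for illustration, and the `trapezoid` alias is an assumption added so the snippet runs on both older NumPy (`np.trapz`) and NumPy 2.x (`np.trapezoid`).

```python
import numpy as np

# NumPy 2.0 renamed trapz to trapezoid; pick whichever this install provides.
trapezoid = getattr(np, "trapezoid", None) or np.trapz

# Hypothetical sample points (unevenly spaced) and measured values.
x = np.array([0.0, 0.5, 1.5, 2.0, 4.0])
y = np.array([1.0, 2.0, 0.5, 3.0, 1.0])

# Trapezoidal rule written out: each interval contributes
# (interval width) * (average of the values at its two endpoints).
widths = np.diff(x)
heights = (y[:-1] + y[1:]) / 2.0
manual = np.sum(widths * heights)

# The library call should agree with the hand-written sum.
library = trapezoid(y, x)
print(manual, library)
```

No function is ever fitted to the data here: the rule only needs the sampled values and their x-coordinates.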

You most likely don't want to convert your data sets into functions, and with trapz you don't need to.

So if I understand correctly, you want to do something like this:

import numpy as np

# NumPy 2.0 renamed trapz to trapezoid; use whichever is available
trapezoid = getattr(np, "trapezoid", None) or np.trapz

# x-coordinates for data points
x = np.array([0, 0.4, 1.6, 1.9, 2, 4, 5, 9, 10])

# some made-up data: 3 arbitrary data sets (sharing the same x-coordinates)
y = np.zeros([len(x), 3])
y[:, 0] = 123
y[:, 1] = 1 + x
y[:, 2] = np.cos(x / 5.)
print(y)

# compute approximations for integral(dataset, x=0..10) for datasets i=0,1,2
yi = trapezoid(y, x[:, np.newaxis], axis=0)
# what happens here: x is given shape (len(x), 1) so that it broadcasts against y;
# np.newaxis adds a new "virtual" axis to x, in effect saying that the
# x-coordinates are the same for each data set

# approximations of the integrals based on the data sets
# (here we also know the exact values, so print them too)
print(yi[0], 123 * 10)
print(yi[1], 10 + 10 * 10 / 2.)
print(yi[2], np.sin(10. / 5.) * 5.)
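
Applied to the layout described in the question, the same call can be restricted to a slice of rows. The sketch below is only an illustration under stated assumptions: the 10 x 20 array is random stand-in data, rows are counted from 0, "between row 4 and row 8" is taken to include both endpoints, and the rows are assumed to be one x-unit apart. It also computes the mean-times-width estimate proposed in the question for comparison.

```python
import numpy as np

# NumPy 2.0 renamed trapz to trapezoid; pick whichever this install provides.
trapezoid = getattr(np, "trapezoid", None) or np.trapz

# Hypothetical stand-in for the 10 x 20 array of measurements.
rng = np.random.default_rng(0)
data = rng.random((10, 20))

# Rows 4..8 inclusive (0-based indexing is an assumption).
lo, hi = 4, 8
window = data[lo:hi + 1, :]

# One integral per column; dx=1.0 assumes unit spacing between rows.
integrals = trapezoid(window, dx=1.0, axis=0)  # shape (20,)

# The alternative from the question: mean over the rows times the interval width.
mean_estimate = window.mean(axis=0) * (hi - lo)

print(integrals.shape, mean_estimate.shape)
```

The two estimates generally differ slightly, because the trapezoidal rule gives the two end rows half weight while the plain mean weights all five rows equally. If a 20 x 1 column is needed, `integrals[:, np.newaxis]` reshapes the result.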
