Averaging data in bins

Ada*_*dam 5 python numpy average scientific-computing python-3.x

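I have two lists: one of depths and one of chlorophyll values, which correspond to each other element by element. I want to average the chlorophyll data over every 0.5 m of depth.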

chl  = [0.4,0.1,0.04,0.05,0.4,0.2,0.6,0.09,0.23,0.43,0.65,0.22,0.12,0.2,0.33]
depth = [0.1,0.3,0.31,0.44,0.49,1.1,1.145,1.33,1.49,1.53,1.67,1.79,1.87,2.1,2.3]

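The depth bins are not always equal in length, and they do not always start at 0.0 or on a 0.5 interval. The chlorophyll data always lines up with the depth data, though. The chlorophyll averages must not be sorted in ascending order; they need to stay in the correct order according to depth. The depth and chlorophyll lists are very long, so I cannot do this by hand.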

How can I make 0.5 m depth bins that hold the averaged chlorophyll data?

Goal:

depth = [0.5,1.0,1.5,2.0,2.5]
chlorophyll = [avg1,avg2,avg3,avg4,avg5]

For example:

avg1 = np.mean([0.4, 0.1, 0.04, 0.05, 0.4])

mir*_*ulo 6

I'm surprised that scipy.stats.binned_statistic hasn't been mentioned yet. You can calculate the mean directly with it, and specify the bins with optional parameters.

from scipy.stats import binned_statistic

mean_stat = binned_statistic(depth, chl, 
                             statistic='mean', 
                             bins=5, 
                             range=(0, 2.5))

mean_stat.statistic
# array([0.198,   nan, 0.28 , 0.355, 0.265])
mean_stat.bin_edges
# array([0. , 0.5, 1. , 1.5, 2. , 2.5])
mean_stat.binnumber
# array([1, 1, 1, ..., 4, 5, 5])
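The question mentions that the depths do not always start at 0.0, so a fixed `range=(0, 2.5)` may not always fit. The following is a minimal sketch, assuming the bin edges should fall on multiples of 0.5 m and that each bin is labelled by its upper edge, as in the desired output. The names `width`, `edges` and `labels` are illustrative and not part of the answer above.

```python
import numpy as np
from scipy.stats import binned_statistic

chl = [0.4, 0.1, 0.04, 0.05, 0.4, 0.2, 0.6, 0.09, 0.23, 0.43, 0.65, 0.22, 0.12, 0.2, 0.33]
depth = [0.1, 0.3, 0.31, 0.44, 0.49, 1.1, 1.145, 1.33, 1.49, 1.53, 1.67, 1.79, 1.87, 2.1, 2.3]

width = 0.5  # assumed bin width in metres

# Snap the outermost edges outwards to multiples of the bin width
start = np.floor(min(depth) / width) * width
stop = np.ceil(max(depth) / width) * width
n_bins = int(round((stop - start) / width))
edges = np.linspace(start, stop, n_bins + 1)

# Passing explicit edges as `bins` replaces the bins/range pair
means, bin_edges, _ = binned_statistic(depth, chl, statistic='mean', bins=edges)

# Label each bin by its upper edge; bins with no samples have a NaN mean
labels = bin_edges[1:]
```

Here `labels` and `means` stay in depth order, and an empty bin (such as 0.5 to 1.0 m in the sample data) shows up as `nan` rather than being dropped.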


jpp*_*jpp 3

One way is to use numpy.digitize to assign each depth to a bin.

Then use a dictionary or list comprehension to compute the result.

import numpy as np

chl  = np.array([0.4,0.1,0.04,0.05,0.4,0.2,0.6,0.09,0.23,0.43,0.65,0.22,0.12,0.2,0.33])
depth = np.array([0.1,0.3,0.31,0.44,0.49,1.1,1.145,1.33,1.49,1.53,1.67,1.79,1.87,2.1,2.3])

bins = np.array([0,0.5,1.0,1.5,2.0,2.5])

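# Pair each chlorophyll value with its bin index (column 0: bin index, column 1: chl)
A = np.vstack((np.digitize(depth, bins), chl)).T

# Mean chlorophyll per occupied bin, keyed by the bin's upper edge
res = {bins[int(i)]: np.mean(A[A[:, 0] == i, 1]) for i in np.unique(A[:, 0])}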

# {0.5: 0.198, 1.5: 0.28, 2.0: 0.355, 2.5: 0.265}

Or, for the exact format you are after:

# np.digitize numbers the bins 1..len(bins)-1, so start at 1 to skip the "below range" slot
res_lst = [np.mean(A[A[:, 0] == i, 1]) for i in range(1, len(bins))]

# [0.198, nan, 0.28, 0.355, 0.265]
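Since the lists are long, the per-bin Python loop can also be replaced by vectorised sums. The following is a sketch of that idea using np.bincount on the np.digitize indices, assuming every depth lies inside the range covered by `bins`. The names `idx`, `sums`, `counts` and `means` are illustrative.

```python
import numpy as np

chl = np.array([0.4, 0.1, 0.04, 0.05, 0.4, 0.2, 0.6, 0.09, 0.23, 0.43, 0.65, 0.22, 0.12, 0.2, 0.33])
depth = np.array([0.1, 0.3, 0.31, 0.44, 0.49, 1.1, 1.145, 1.33, 1.49, 1.53, 1.67, 1.79, 1.87, 2.1, 2.3])
bins = np.array([0, 0.5, 1.0, 1.5, 2.0, 2.5])

# Bin index per sample: i means bins[i-1] <= depth < bins[i]
idx = np.digitize(depth, bins)

# Per-bin sum of chlorophyll and per-bin sample count, computed in one pass each
sums = np.bincount(idx, weights=chl, minlength=len(bins))[1:len(bins)]
counts = np.bincount(idx, minlength=len(bins))[1:len(bins)]

# Divide only where a bin has samples; empty bins stay NaN
means = np.full(len(bins) - 1, np.nan)
np.divide(sums, counts, out=means, where=counts > 0)
```

The result has one entry per bin in depth order, so it pairs directly with the upper edges `bins[1:]`.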