kon*_*tin 1 python arrays numpy
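I have a 2D numpy array holding my values (some of which may be NaN). I want to remove 30% of the non-NaN values and replace them with the mean of the array. How can I do this? What I have tried so far: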
def spar_removal(array, mean_value, sparseness):
    array1 = deepcopy(array)
    array2 = array1
    spar_size = int(round(array2.shape[0]*array2.shape[1]*sparseness))
    for i in range(0, spar_size):
        index = np.random.choice(np.where(array2 != mean_value)[1])
        array2[0, index] = mean_value
    return array2
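But this only ever picks elements from the same row of the array. How can I remove values from the whole array? It seems that np.random.choice only works on one dimension. I think what I want is to compute the (x, y) pairs that I will replace with mean_value.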
There may be a better way, but consider:
import numpy as np
x = np.array([[1, 2, 3, 4],
              [1, 2, 3, 4],
              [np.nan, np.nan, np.nan, np.nan],
              [1, 2, 3, 4]])
# Get the flat (1-D) indices of the non-NaN elements
indices = np.where(np.isfinite(x).ravel())[0]
# Shuffle the indices, select the first 30% (rounded down with int())
to_replace = np.random.permutation(indices)[:int(indices.size * 0.3)]
# Replace those indices with the mean (ignoring NaNs)
x[np.unravel_index(to_replace, x.shape)] = np.nanmean(x)
print(x)
Example output:
[[ 2.5  2.   2.5  4. ]
 [ 1.   2.   3.   4. ]
 [ nan  nan  nan  nan]
 [ 2.5  2.   3.   4. ]]
The NaNs are never changed, and floor(0.3 * number of non-NaN elements) values are set to the mean (computed ignoring NaNs).
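If you want this as a reusable function, the same idea could be wrapped up roughly as follows. This is only a sketch: the name replace_fraction_with_mean and its fraction and rng parameters are made up for illustration, and it assumes you want a modified copy rather than an in-place change. It samples flat indices without replacement via Generator.choice instead of shuffling them all, which should give the same kind of result.

```python
import numpy as np


def replace_fraction_with_mean(array, fraction=0.3, rng=None):
    """Return a float copy of `array` in which a random `fraction` of the
    non-NaN entries is replaced by the NaN-ignoring mean.

    Hypothetical helper; `rng` may be None, an int seed, or a Generator.
    """
    rng = np.random.default_rng(rng)
    result = np.array(array, dtype=float)  # copy, so the input is untouched
    # Flat (1-D) positions of every non-NaN entry
    candidates = np.flatnonzero(~np.isnan(result))
    n_replace = int(candidates.size * fraction)  # rounded down
    # Sample without replacement so no position is picked twice
    chosen = rng.choice(candidates, size=n_replace, replace=False)
    # Compute the mean before overwriting anything
    mean_value = np.nanmean(result)
    result.flat[chosen] = mean_value
    return result


x = np.array([[1, 2, 3, 4],
              [1, 2, 3, 4],
              [np.nan, np.nan, np.nan, np.nan],
              [1, 2, 3, 4]])
y = replace_fraction_with_mean(x, fraction=0.3, rng=0)
print(y)
```

Passing a seed (rng=0 here) just makes the selection repeatable; leave it as None for a different selection on every call. Which three positions end up replaced depends on the seed, but the NaN row stays as it is.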