Making a numpy matrix more sparse

par*_*gin 2 python numpy recommendation-engine matrix

Suppose I have a numpy array:

np.array([
    [3, 0, 5, 3, 0, 1],
    [0, 1, 2, 1, 5, 2],
    [4, 3, 5, 3, 1, 4],
    [2, 5, 2, 5, 3, 1],
    [0, 1, 2, 1, 5, 2],
])

Now I want to randomly replace some of the elements with 0, so that I end up with output like this:

np.array([
    [3, 0, 0, 3, 0, 1],
    [0, 1, 2, 0, 5, 2],
    [0, 3, 0, 3, 1, 0],
    [2, 0, 2, 5, 0, 1],
    [0, 0, 2, 0, 5, 0],
])

Div*_*kar 6

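We can use np.random.choice(..., replace=False) to randomly select a set of unique flat indices of non-zero elements, and then simply index into the input array and reset those positions to zero.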
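To make the two steps concrete, here is a small sketch on a made-up 2x3 array (the array `a`, the seed, and the sample size are illustrative and not from the question). It uses the newer `np.random.default_rng` Generator rather than the legacy `np.random.choice`; both accept `replace=False`.

```python
import numpy as np

# Hypothetical toy input, chosen only to keep the indices easy to read.
a = np.array([[3, 0, 5],
              [0, 1, 2]])

# Step 1: flat (row-major) positions of the non-zero entries.
idx = np.flatnonzero(a)
print(idx)  # [0 2 4 5]

# Step 2: pick 2 distinct positions; replace=False rules out repeats.
rng = np.random.default_rng(0)  # seed is an assumption, for reproducibility only
picked = rng.choice(idx, size=2, replace=False)

# Step 3: zero the chosen positions through the flat view of the array.
a.flat[picked] = 0
print(np.count_nonzero(a))  # 2
```

Because the indices are drawn only from `idx`, entries that are already zero are never selected, so exactly two non-zero values are cleared.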

So one solution would be:

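import numpy as np

def make_more_sparsey(a, n):
    # a is the input array (it is modified in place)
    # n is the number of non-zero elements to be reset to zero
    idx = np.flatnonzero(a)  # for performance, use np.flatnonzero(a != 0)
    # pick n distinct flat indices among the non-zero ones and zero them
    np.put(a, np.random.choice(idx, n, replace=False), 0)
    return a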
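Note that the function above writes into the array it is given. If the original ratings need to stay intact, a variant along these lines may be handy. This is only a sketch: the name `make_more_sparse_copy` and the `frac` and `seed` parameters are my own additions, not part of the answer above.

```python
import numpy as np

def make_more_sparse_copy(a, frac, seed=None):
    """Return a copy of `a` with a fraction `frac` of its non-zero entries zeroed.

    Hypothetical helper: `frac` and `seed` are illustrative parameters.
    """
    if not 0.0 <= frac <= 1.0:
        raise ValueError("frac must be between 0 and 1")
    out = np.array(a, copy=True)        # work on a copy, leave `a` untouched
    idx = np.flatnonzero(out)           # flat indices of non-zero entries
    n = int(round(frac * idx.size))     # how many of them to clear
    if n > 0:
        rng = np.random.default_rng(seed)
        out.flat[rng.choice(idx, size=n, replace=False)] = 0
    return out

R = np.array([
    [3, 0, 5, 3, 0, 1],
    [0, 1, 2, 1, 5, 2],
    [4, 3, 5, 3, 1, 4],
    [2, 5, 2, 5, 3, 1],
    [0, 1, 2, 1, 5, 2],
])
S = make_more_sparse_copy(R, frac=0.5, seed=0)
print(np.count_nonzero(R), np.count_nonzero(S))  # 26 13
```

Specifying a fraction instead of a count tends to be more convenient when the same routine is applied to matrices of different sizes, and passing a seed makes the result repeatable.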

Sample run:

In [204]: R = np.array([
     ...:     [3, 0, 5, 3, 0, 1],
     ...:     [0, 1, 2, 1, 5, 2],
     ...:     [4, 3, 5, 3, 1, 4],
     ...:     [2, 5, 2, 5, 3, 1],
     ...:     [0, 1, 2, 1, 5, 2],
     ...: ])

In [205]: make_more_sparsey(R, n=5)
Out[205]: 
array([[3, 0, 5, 3, 0, 1],
       [0, 1, 0, 0, 5, 2],
       [4, 3, 5, 3, 1, 4],
       [2, 5, 0, 5, 3, 1],
       [0, 1, 0, 1, 0, 2]])
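Since the selection is random, your output will differ from the run above. One way to check the result regardless of which entries were picked is to compare against a copy taken beforehand. The sketch below repeats the function so that it runs on its own; the names `before`, `result`, and `changed` are just illustrative.

```python
import numpy as np

def make_more_sparsey(a, n):
    # Same logic as the answer's function, repeated so this snippet is self-contained.
    idx = np.flatnonzero(a)
    np.put(a, np.random.choice(idx, n, replace=False), 0)
    return a

R = np.array([
    [3, 0, 5, 3, 0, 1],
    [0, 1, 2, 1, 5, 2],
    [4, 3, 5, 3, 1, 4],
    [2, 5, 2, 5, 3, 1],
    [0, 1, 2, 1, 5, 2],
])
before = R.copy()                    # R is modified in place, so keep the original
result = make_more_sparsey(R, n=5)

changed = before != result           # boolean mask of positions that were altered
print(result is R)                          # True: the same array object comes back
print(int(changed.sum()))                   # 5
print(bool(np.all(result[changed] == 0)))   # True: every altered entry is now 0
print(bool(np.all(before[changed] != 0)))   # True: only non-zero entries were touched
```

Keep in mind that asking for more elements than there are non-zero entries makes `np.random.choice(..., replace=False)` raise a `ValueError`, so `n` should not exceed `np.count_nonzero(a)`.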