Bil*_*Boy 4 python arrays numpy
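I have a very large NumPy array (length about 150 million) with very few nonzero values (about 99.9% of the array is 0). I want to shuffle it, but shuffling is slow: it takes about 10 seconds, which is not acceptable because I am running Monte Carlo simulations. Is there a way to shuffle it that takes advantage of the fact that the array is mostly zeros?
I am thinking of shuffling only the positive values and then inserting them at random positions in an array full of zeros, but I can't find a NumPy function for that.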
Approach #1: Here's one approach -
import numpy as np

def shuffle_sparse_arr(a):
    out = np.zeros_like(a)
    mask = a != 0
    n = np.count_nonzero(mask)
    # Pick n distinct target positions; choice returns them in random order
    idx = np.random.choice(a.size, n, replace=False)
    out[idx] = a[mask]
    return out
Approach #2: Hackish way -
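def shuffle_sparse_arr_hackish(a):
    out = np.zeros_like(a)
    mask = a != 0
    n = np.count_nonzero(mask)
    # Oversample 2*n random positions and drop duplicates
    idx = np.unique((a.size*np.random.rand(2*n)).astype(int))
    while idx.size < n:  # retry in the unlikely case of too many duplicates
        idx = np.unique((a.size*np.random.rand(2*n)).astype(int))
    # np.unique returns sorted values, so shuffle before truncating to n;
    # slicing first would keep only the smallest positions
    np.random.shuffle(idx)
    out[idx[:n]] = a[mask]
    return out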
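The order of the shuffle and the truncation matters in the hackish approach because `np.unique` returns its result sorted. The following is a minimal sketch, not part of the original answer, that compares the two orderings; the array size, the count `n`, and the seed are arbitrary values chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)   # seed chosen only for repeatability
size, n = 1_000_000, 1_000       # illustrative sizes, not from the answer

# Oversample 2*n positions and drop duplicates; the result is sorted.
draws = np.unique(rng.integers(0, size, 2 * n))

# Truncating the sorted result keeps only the smallest candidates.
truncated_first = draws[:n]

# Permuting before truncating keeps a random subset of the candidates.
shuffled_first = rng.permutation(draws)[:n]

# Fraction of the index range reached by each variant.
print(truncated_first.max() / size)   # stays in the lower part of the range
print(shuffled_first.max() / size)    # reaches close to the end of the range
```

With truncation first, the nonzero values would land mostly in the lower part of the array, so the result would not be a uniform shuffle.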
Sample run -
In [269]: # Setup input array
     ...: a = np.zeros((20),dtype=int)
     ...: sidx = np.random.choice(a.size, 6, replace=0)
     ...: a[sidx] = [5,8,4,1,7,3]
     ...:
In [270]: a
Out[270]: array([4, 0, 0, 8, 0, 0, 5, 0, 0, 0, 0, 7, 0, 0, 1, 0, 0, 0, 0, 3])
In [271]: shuffle_sparse_arr(a)
Out[271]: array([0, 5, 0, 0, 0, 0, 1, 0, 4, 0, 0, 0, 0, 0, 0, 7, 3, 8, 0, 0])
In [272]: shuffle_sparse_arr_hackish(a)
Out[272]: array([3, 1, 5, 0, 4, 0, 7, 0, 0, 0, 8, 0, 0, 0, 0, 0, 0, 0, 0, 0])
Runtime test -
In [288]: # Setup input array with 15 million and 99.9% zeros
     ...: a = np.zeros((15000000),dtype=int)
     ...:
     ...: # Set 100-99.9% as random non-zeros
     ...: n = int(a.size*((100-99.9)/100))
     ...:
     ...: set_idx = np.random.choice(a.size, n , replace=0)
     ...: nums = np.random.choice(a.size, n , replace=0)
     ...: a[set_idx] = nums
     ...:
In [289]: %timeit shuffle_sparse_arr(a)
1 loops, best of 3: 647 ms per loop
In [290]: %timeit shuffle_sparse_arr_hackish(a)
10 loops, best of 3: 29.1 ms per loop
In [291]: %timeit np.random.shuffle(a)
1 loops, best of 3: 606 ms per loop
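A possible variant of approach #1 uses NumPy's newer `Generator` API in place of the legacy `np.random.choice`. This is a sketch under assumptions, not something from the original answer: the function name `shuffle_sparse_arr_rng` and the optional `rng` parameter are hypothetical, the input is assumed to be 1-D as in the examples above, and it has not been timed here, so it should be benchmarked on the real data before relying on it.

```python
import numpy as np

def shuffle_sparse_arr_rng(a, rng=None):
    """Scatter the nonzero values of a 1-D array to random distinct positions."""
    # Hypothetical helper: create a Generator if the caller did not pass one.
    rng = np.random.default_rng() if rng is None else rng
    out = np.zeros_like(a)
    vals = a[a != 0]
    # Draw one distinct position per nonzero value, in random order.
    idx = rng.choice(a.size, size=vals.size, replace=False, shuffle=True)
    out[idx] = vals
    return out

# Small usage example with an arbitrary seed.
a = np.zeros(1000, dtype=int)
a[:5] = [1, 2, 3, 4, 5]
b = shuffle_sparse_arr_rng(a, np.random.default_rng(1))
print(np.count_nonzero(b))
```

Passing an explicit `Generator` makes each Monte Carlo run reproducible from its seed.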