Remove runs from a 2D numpy array

Pet*_*ton 8 python arrays algorithm numpy

Given a 2D numpy array:

00111100110111
01110011000110
00111110001000
01101101001110

Is there an efficient way to replace runs of 1s that are >= N long?

For example, if N=3:

00222200110222
02220011000110
00222220001000
01101101002220

In practice the 2D array is binary and I want to replace the runs of 1s with 0s, but for clarity I have replaced them with 2 in the example above.
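To make the notion of a "run" concrete, here is a minimal sketch that locates the runs in a single row using `np.diff` on a zero-padded copy. The helper name `find_runs` is hypothetical (it is not part of numpy), and the row used is the first row of the example above.

```python
import numpy as np

def find_runs(row, search=1):
    """Return (starts, lengths) of consecutive runs of `search` in a 1D array."""
    hits = (np.asarray(row) == search).astype(np.int8)
    # Pad with a zero on both sides so runs touching either edge
    # still produce both a rising and a falling edge.
    padded = np.concatenate(([0], hits, [0]))
    edges = np.diff(padded)
    starts = np.flatnonzero(edges == 1)    # index of the first element of each run
    ends = np.flatnonzero(edges == -1)     # index one past the last element of each run
    return starts, ends - starts

row = np.array([0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1, 1])
starts, lengths = find_runs(row)
print(starts)   # [ 2  8 11]
print(lengths)  # [4 2 3]
```

With N=3, only the runs starting at columns 2 and 11 qualify, which matches the first row of the expected result.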

Runnable example: http://runnable.com/U6q0q-TFWzxVd_Uf/numpy-replace-runs-for-python

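The code I'm currently using looks a bit hacky, and I feel there is probably some magic numpy way of doing this:

Update: I'm aware that I changed the example to a version that didn't handle edge cases. That was a minor implementation bug (now fixed). I'm more interested in whether there is a faster way.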

import numpy as np
import time

def replace_runs(a, search, run_length, replace=2):
  a_copy = a.copy()  # Don't modify the original
  for i, row in enumerate(a):
    runs = []
    current_run = []
    for j, val in enumerate(row):
      if val == search:
        current_run.append(j)
      else:
        # A run has just ended; keep it only if it is long enough
        if len(current_run) >= run_length:
          runs.append(current_run)
        current_run = []

    # Handle a run that extends to the end of the row
    if len(current_run) >= run_length:
      runs.append(current_run)

    for run in runs:
      for col in run:
        a_copy[i][col] = replace

  return a_copy

arr = np.array([
  [0,0,1,1,1,1,0,0,1,1,0,1,1,1],
  [0,1,1,1,0,0,1,1,0,0,0,1,1,0],
  [0,0,1,1,1,1,1,0,0,0,1,0,0,0],
  [0,1,1,0,1,1,0,1,0,0,1,1,1,0],
  [1,1,1,1,1,1,1,1,1,1,1,1,1,1],
  [0,0,0,0,0,0,0,0,0,0,0,0,0,0],
  [1,1,1,1,1,1,1,1,1,1,1,1,1,0],
  [0,1,1,1,1,1,1,1,1,1,1,1,1,1],
])

print(arr)
print(replace_runs(arr, 1, 3))

iterations = 100000

t0 = time.time()
for i in range(0, iterations):
  replace_runs(arr, 1, 3)
t1 = time.time()

print("replace_runs: %d iterations took %.3fs" % (iterations, t1 - t0))

Output:

[[0 0 1 1 1 1 0 0 1 1 0 1 1 1]
 [0 1 1 1 0 0 1 1 0 0 0 1 1 0]
 [0 0 1 1 1 1 1 0 0 0 1 0 0 0]
 [0 1 1 0 1 1 0 1 0 0 1 1 1 0]
 [1 1 1 1 1 1 1 1 1 1 1 1 1 1]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [1 1 1 1 1 1 1 1 1 1 1 1 1 0]
 [0 1 1 1 1 1 1 1 1 1 1 1 1 1]]

[[0 0 2 2 2 2 0 0 1 1 0 2 2 2]
 [0 2 2 2 0 0 1 1 0 0 0 1 1 0]
 [0 0 2 2 2 2 2 0 0 0 1 0 0 0]
 [0 1 1 0 1 1 0 1 0 0 2 2 2 0]
 [2 2 2 2 2 2 2 2 2 2 2 2 2 2]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [2 2 2 2 2 2 2 2 2 2 2 2 2 0]
 [0 2 2 2 2 2 2 2 2 2 2 2 2 2]]

replace_runs: 100000 iterations took 14.406s

usu*_* me 0

This is slightly faster than the OP's version, but still hacky:

import itertools

def replace2(originalM):
    m = originalM.copy()
    for v in m:
        idx = 0
        # groupby yields (value, group) for each run; count the group to get its length
        for (key, n) in ((key, sum(1 for _ in group)) for (key, group) in itertools.groupby(v)):
            if key and n >= 3:
                v[idx:idx+n] = 2
            idx += n
    return m

%%timeit
replace_runs(arr, 1, 3)
10000 loops, best of 3: 61.8 µs per loop

%%timeit
replace2(arr)
10000 loops, best of 3: 48 µs per loop
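For comparison, a variant without Python-level loops could be sketched as follows. It is an illustration only, not something I have benchmarked: the name `replace_runs_vectorized` is hypothetical, and the approach (finding run edges with `np.diff` on a zero-padded array, then rebuilding a mask with `np.cumsum`) is a different technique from the `groupby` version above.

```python
import numpy as np

def replace_runs_vectorized(a, search=1, run_length=3, replace=2):
    a = np.asarray(a)
    hits = (a == search).astype(np.int8)
    # Pad each row with a zero column on both sides so that runs
    # touching the row edges still have a start and an end.
    padded = np.pad(hits, ((0, 0), (1, 1)), mode="constant")
    edges = np.diff(padded, axis=1)            # shape: (rows, cols + 1)
    # np.nonzero scans in row-major order and starts/ends alternate
    # within a row, so the k-th start pairs with the k-th end.
    start_rows, start_cols = np.nonzero(edges == 1)
    _, end_cols = np.nonzero(edges == -1)      # exclusive end of each run
    keep = (end_cols - start_cols) >= run_length
    # Mark +1 where a long run begins and -1 just past where it ends;
    # a cumulative sum along each row then yields a 0/1 mask.
    marks = np.zeros(edges.shape, dtype=np.int8)
    marks[start_rows[keep], start_cols[keep]] = 1
    marks[start_rows[keep], end_cols[keep]] = -1
    mask = np.cumsum(marks, axis=1)[:, :-1].astype(bool)
    out = a.copy()                             # leave the input untouched
    out[mask] = replace
    return out

arr = np.array([
    [0,0,1,1,1,1,0,0,1,1,0,1,1,1],
    [0,1,1,1,0,0,1,1,0,0,0,1,1,0],
    [0,0,1,1,1,1,1,0,0,0,1,0,0,0],
    [0,1,1,0,1,1,0,1,0,0,1,1,1,0],
])
print(replace_runs_vectorized(arr, 1, 3))
```

Passing `replace=0` gives the behaviour the question actually asks for (zeroing out long runs of 1s in a binary array).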