Setting a NumPy array between two values, fast

Fin*_*ent 12 python arrays performance numpy

I have been looking for a solution to this problem for a while but can't seem to find anything.

For example, I have a numpy array

[ 0,  0,  2,  3,  2,  4,  3,  4,  0,  0, -2, -1, -4, -2, -1, -3, -4,  0,  2,  3, -2, -1,  0]

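What I want to achieve is to generate another array that marks the elements between a pair of numbers, say between 2 and -2. So I want to get an array like this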

[ 0,  0,  1,  1,  1,  1,  1,  1,  1,  1,  1,  0,  0,  0,  0,  0,  0,  0,  1,  1,  1,  0,  0]

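Note that any 2 or -2 that falls between a (2, -2) pair is ignored. A simple approach is to iterate over each element with a for loop, find the first occurrence of 2, and set everything from there on to 1 until you reach -2, then start looking for the next 2 again.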
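For reference, the loop described above could be sketched roughly as follows. The function name `mark_between_loop` and its argument names are made up for illustration; the stop element itself is marked as 1, to match the expected output shown above.

```python
import numpy as np

def mark_between_loop(a, t1, t2):
    """Loop baseline: 1 from the first t1 up to and including the next t2."""
    out = np.zeros(len(a), dtype=int)
    inside = False
    for i, v in enumerate(a):
        if not inside and v == t1:
            inside = True          # first start trigger opens a span
        if inside:
            out[i] = 1
            if v == t2:
                inside = False     # stop trigger closes the span (inclusive)
    return out

a = np.array([0, 0, 2, 3, 2, 4, 3, 4, 0, 0, -2, -1, -4, -2, -1, -3, -4,
              0, 2, 3, -2, -1, 0])
print(mark_between_loop(a, 2, -2))
```

This is only a baseline to compare faster approaches against; a start trigger that is never closed simply keeps marking until the end of the array.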

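But I would like this process to be faster, since I have more than 1000 elements in a numpy array and the process needs to be done many times. Do you know an elegant way to solve this? Thanks in advance!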

Div*_*kar 4

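That's quite a problem! Listed in this post is a vectorized solution (hopefully the inline comments will help explain the logic behind it). I am assuming A as the input array, with T1 and T2 as the start and stop triggers.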

import numpy as np

def setones_between_triggers(A,T1,T2):

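    # Get start and stop indices corresponding to rising and falling triggers
    start = np.where(A==T1)[0]
    stop = np.where(A==T2)[0]

    # No start trigger at all : nothing to mark
    if start.size == 0:
        return np.zeros(A.size,dtype=int)

    # Take care of boundary conditions for np.searchsorted to work : if the
    # last start has no stop after it, treat the last element as a stop
    if (stop.size == 0 or stop[-1] < start[-1]) and (start[-1] != A.size-1):
        stop = np.append(stop,A.size-1)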

    # This is where the magic happens.
    # Validate (filter out) the triggers based on the set conditions :
    # 1. If there is more than one stop index between two start indices,
    # use the first one and reject all others in that in-between space.
    # 2. Repeat the same check for start, but against the validated stop indices.

    # First off, take care of out-of-bound cases (a start with no stop after it)
    # for proper indexing
    stop_valid_idx = np.unique(np.searchsorted(stop,start,'right'))
    stop_valid_idx = stop_valid_idx[stop_valid_idx < stop.size]

    stop_valid = stop[stop_valid_idx]
    _,idx = np.unique(np.searchsorted(stop_valid,start,'left'),return_index=True)
    start_valid = start[idx]

    # Create shifts array (array filled with zeros, unless triggered by T1 and T2 
    # for which we have +1 and -1 as triggers). 
    shifts = np.zeros(A.size,dtype=int)
    shifts[start_valid] = 1
    shifts[stop_valid] = -1

    # Perform cumulative summation that would almost give us the desired output
    out = shifts.cumsum()

    # cumsum drops back to 0 at each stop position, but the stop element belongs
    # to the span (and in the worst case two (T1,T2) groups are adjacent to each
    # other), so set the negative trigger positions as 1 as well
    out[stop_valid] = 1    
    return out
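The shifts-plus-cumsum step at the end of the function may be easier to see on a tiny example. This is just a sketch with hand-picked, hypothetical index arrays standing in for the already-filtered `start_valid` and `stop_valid`:

```python
import numpy as np

n = 10
start_valid = np.array([2, 7])   # hypothetical, already-filtered start indices
stop_valid = np.array([4, 8])    # hypothetical, already-filtered stop indices

# +1 where a span opens, -1 where it closes, 0 elsewhere
shifts = np.zeros(n, dtype=int)
shifts[start_valid] = 1
shifts[stop_valid] = -1

# Running sum is 1 inside a span, but already back to 0 at the stop itself
out = shifts.cumsum()
print(out)            # [0 0 1 1 0 0 0 1 0 0]

# Include the stop positions so the span is closed on both ends
out[stop_valid] = 1
print(out)            # [0 0 1 1 1 0 0 1 1 0]
```

The running sum only works because the filtering step guarantees that starts and stops strictly alternate; with unfiltered triggers the sum could go above 1 or below 0.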

Sample runs

Original sample case:

In [1589]: A
Out[1589]: 
array([ 0,  0,  2,  3,  2,  4,  3,  4,  0,  0, -2, -1, -4, -2, -1, -3, -4,
        0,  2,  3, -2, -1,  0])

In [1590]: setones_between_triggers(A,2,-2)
Out[1590]: array([0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0])
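For this original sample, the two np.searchsorted filtering steps can be traced by hand. The snippet below is only an illustrative walk-through; the intermediate names `first_stop`, `groups` and `first_in_group` are made up here and do not appear in the function.

```python
import numpy as np

A = np.array([0, 0, 2, 3, 2, 4, 3, 4, 0, 0, -2, -1, -4, -2, -1, -3, -4,
              0, 2, 3, -2, -1, 0])
start = np.where(A == 2)[0]     # [ 2  4 18]
stop = np.where(A == -2)[0]     # [10 13 20]

# For each start, the index (into stop) of the first stop after it
first_stop = np.searchsorted(stop, start, side='right')
print(first_stop)               # [0 0 2]

# Keep each such stop once; the stop at 13 is never "first" and is dropped
stop_valid = stop[np.unique(first_stop)]
print(stop_valid)               # [10 20]

# Group starts by the valid stop that closes them, keep the first per group
groups = np.searchsorted(stop_valid, start, side='left')
_, first_in_group = np.unique(groups, return_index=True)
start_valid = start[first_in_group]
print(start_valid)              # [ 2 18]
```

The start at index 4 and the stop at index 13 are the ones the question asks to ignore, and they are exactly the ones filtered out here.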

Worst case #1 (adjacent (2,-2) groups):

In [1595]: A
Out[1595]: 
array([-2,  2,  0,  2, -2,  2,  2,  2,  4, -2,  0, -2, -2, -4, -2, -1,  2,
       -4,  0,  2,  3, -2, -2,  0])

In [1596]: setones_between_triggers(A,2,-2)
Out[1596]: 
array([0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0,
       0], dtype=int32)

Worst case #2 (a 2 without any closing -2):

In [1603]: A
Out[1603]: 
array([-2,  2,  0,  2, -2,  2,  2,  2,  4, -2,  0, -2, -2, -4, -2, -1, -2,
       -4,  0,  2,  3,  5,  6,  0])

In [1604]: setones_between_triggers(A,2,-2)
Out[1604]: 
array([0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1,
       1], dtype=int32)