What is a faster way to index this list without for and while loops? (Python)

Mcl*_*000 1 python optimization performance numpy

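I am trying to find a way to get rid of this while loop because it is very costly in time. What I am doing here is indexing the list (data), finding the highest value in [x:x+9], storing it in another array (result), and then adding 1 to x so that the whole list gets indexed. Is this a silly way to do it? Is there a faster, smarter way? Any help is much appreciated. I hope I have explained this well.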

import numpy as np

def calc(data):
    result = np.zeros(len(data))  # allocate space for the results
    x = 0
    while x < len(data):
        # highest value in the (up to) 9 elements starting at x
        highest_value = max(data[x:x+9])
        print(f'{data[x:x+9]} highest value = {highest_value}')
        result[x] = highest_value
        print(result)
        x += 1
    return result

data = [7,6,5,4,3,4,2,3,4,5,6,7,8,9,3,5,4,2,3,1]

result = calc(data)

Out:

[7, 6, 5, 4, 3, 4, 2, 3, 4] highest value = 7
[7. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[6, 5, 4, 3, 4, 2, 3, 4, 5] highest value = 6
[7. 6. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[5, 4, 3, 4, 2, 3, 4, 5, 6] highest value = 6
[7. 6. 6. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[4, 3, 4, 2, 3, 4, 5, 6, 7] highest value = 7
[7. 6. 6. 7. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[3, 4, 2, 3, 4, 5, 6, 7, 8] highest value = 8
[7. 6. 6. 7. 8. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[4, 2, 3, 4, 5, 6, 7, 8, 9] highest value = 9
[7. 6. 6. 7. 8. 9. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[2, 3, 4, 5, 6, 7, 8, 9, 3] highest value = 9
[7. 6. 6. 7. 8. 9. 9. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[3, 4, 5, 6, 7, 8, 9, 3, 5] highest value = 9
[7. 6. 6. 7. 8. 9. 9. 9. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[4, 5, 6, 7, 8, 9, 3, 5, 4] highest value = 9
[7. 6. 6. 7. 8. 9. 9. 9. 9. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[5, 6, 7, 8, 9, 3, 5, 4, 2] highest value = 9
[7. 6. 6. 7. 8. 9. 9. 9. 9. 9. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[6, 7, 8, 9, 3, 5, 4, 2, 3] highest value = 9
[7. 6. 6. 7. 8. 9. 9. 9. 9. 9. 9. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[7, 8, 9, 3, 5, 4, 2, 3, 1] highest value = 9
[7. 6. 6. 7. 8. 9. 9. 9. 9. 9. 9. 9. 0. 0. 0. 0. 0. 0. 0. 0.]
[8, 9, 3, 5, 4, 2, 3, 1] highest value = 9
[7. 6. 6. 7. 8. 9. 9. 9. 9. 9. 9. 9. 9. 0. 0. 0. 0. 0. 0. 0.]
[9, 3, 5, 4, 2, 3, 1] highest value = 9
[7. 6. 6. 7. 8. 9. 9. 9. 9. 9. 9. 9. 9. 9. 0. 0. 0. 0. 0. 0.]
[3, 5, 4, 2, 3, 1] highest value = 5
[7. 6. 6. 7. 8. 9. 9. 9. 9. 9. 9. 9. 9. 9. 5. 0. 0. 0. 0. 0.]
[5, 4, 2, 3, 1] highest value = 5
[7. 6. 6. 7. 8. 9. 9. 9. 9. 9. 9. 9. 9. 9. 5. 5. 0. 0. 0. 0.]
[4, 2, 3, 1] highest value = 4
[7. 6. 6. 7. 8. 9. 9. 9. 9. 9. 9. 9. 9. 9. 5. 5. 4. 0. 0. 0.]
[2, 3, 1] highest value = 3
[7. 6. 6. 7. 8. 9. 9. 9. 9. 9. 9. 9. 9. 9. 5. 5. 4. 3. 0. 0.]
[3, 1] highest value = 3
[7. 6. 6. 7. 8. 9. 9. 9. 9. 9. 9. 9. 9. 9. 5. 5. 4. 3. 3. 0.]
[1] highest value = 1
[7. 6. 6. 7. 8. 9. 9. 9. 9. 9. 9. 9. 9. 9. 5. 5. 4. 3. 3. 1.]

______________________________________________________________
result: 

[7. 6. 6. 7. 8. 9. 9. 9. 9. 9. 9. 9. 9. 9. 5. 5. 4. 3. 3. 1.]


Wall time: 7.01 ms

Ala*_* T. 5

You can let numpy perform the computation in parallel by giving it the list of indexes for each sub-range:

For example:

import numpy as np

data = np.array([7,6,5,4,3,4,2,3,4,5,6,7,8,9,3,5,4,2,3,1])

idx = np.arange(9)+np.arange(len(data))[:,None] # indexes of subRanges
idx = np.minimum(len(data)-1,idx)               # don't overflow indexes

rollingMax = np.max(data[idx],axis=1) # apply maximums on every subrange

print(rollingMax)
# [7 6 6 7 8 9 9 9 9 9 9 9 9 9 5 5 4 3 3 1]
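The same windowed maximum can also be sketched without building an index matrix, using `numpy.lib.stride_tricks.sliding_window_view` (available in NumPy 1.20 and later). This is a different technique from the one in the answer, shown only as an illustration. The function name `rolling_max_view` is hypothetical, and the edge padding is an assumption chosen to mirror the index clamping above: repeating the last value never changes a window's maximum.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # NumPy >= 1.20

def rolling_max_view(data, window=9):
    """Forward-looking rolling maximum; windows near the end are truncated."""
    data = np.asarray(data)
    # Repeat the last value so every position gets a full-length window.
    # The repeated value is already inside the window, so no maximum changes.
    padded = np.pad(data, (0, window - 1), mode="edge")
    # Each row of the view is one window; building the view copies no data.
    return sliding_window_view(padded, window).max(axis=1)

data = [7,6,5,4,3,4,2,3,4,5,6,7,8,9,3,5,4,2,3,1]
print(rolling_max_view(data))
# [7 6 6 7 8 9 9 9 9 9 9 9 9 9 5 5 4 3 3 1]
```

Because the view is not materialized as a separate index array, this variant should need less temporary memory than fancy indexing, although the reduction still touches `len(data) * window` elements.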

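[EDIT] A faster approach is to iterate over the value offsets rather than the positions. Although this still involves a loop, it is much faster and keeps its speed advantage on larger datasets.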

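def rollingMax2(data,window=9):
    result = data.copy()  # data is expected to be a numpy array
    # after k passes, result[i] holds the maximum of data[i:i+k+1]
    for offset in range(1,window):
        result[:-1] = np.maximum(result[:-1],result[1:])
    return result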

                         speed improvement
number of values     rollingMax   rollingMax2
           20             3x          3x
          200            15x         31x  
        2,000            26x         99x
       20,000            35x        167x
      200,000            21x        220x 
    2,000,000            11x         66x 
   20,000,000            11x         35x
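To check that the offset-based approach gives the same answer as the original loop, and to measure the speed-up on your own machine, something along these lines may help. It is a minimal sketch: `calc_loop` and `rolling_max_offsets` are hypothetical names (the latter repeats the logic of `rollingMax2`), and the random test data and repetition count are arbitrary assumptions. Timings will vary by machine, so none are shown.

```python
import timeit
import numpy as np

def calc_loop(data, window=9):
    # Reference: the question's position-by-position loop, without the prints.
    return np.array([max(data[x:x + window]) for x in range(len(data))])

def rolling_max_offsets(data, window=9):
    # Same logic as rollingMax2: window-1 passes of a pairwise maximum.
    result = np.array(data)  # np.array copies, so the input is left untouched
    for _ in range(1, window):
        result[:-1] = np.maximum(result[:-1], result[1:])
    return result

# Arbitrary test data (assumption): 2,000 random integers with a fixed seed.
rng = np.random.default_rng(0)
values = rng.integers(0, 100, size=2_000)
as_list = values.tolist()

# Both approaches must agree on every position.
assert np.array_equal(calc_loop(as_list), rolling_max_offsets(values))

loop_time = timeit.timeit(lambda: calc_loop(as_list), number=5)
fast_time = timeit.timeit(lambda: rolling_max_offsets(values), number=5)
print(f"loop: {loop_time:.4f}s  offsets: {fast_time:.4f}s")
```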