In-place insertion into a list (or array)

sta*_*ane 2 python arrays indexing numpy list-comprehension

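I am running a script in Python in which I need to insert numbers into an array (or list) at certain index positions. The problem is that once I insert a new number, the remaining index positions are no longer valid. Is there a clever way to insert the new values at all index positions at once? Or is the only solution to increment the index number (the first value of each pair) as I add values?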

Sample test snippet:

original_list = [0, 1, 2, 3, 4, 5, 6, 7]
insertion_indices = [1, 4, 5]
new_numbers = [8, 9, 10]
pairs = [(insertion_indices[i], new_numbers[i]) for i in range(len(insertion_indices))]

for pair in pairs:
    original_list.insert(pair[0], pair[1])

The result is:

[0, 8, 1, 2, 9, 10, 3, 4, 5, 6, 7]

Whereas I want:

[0, 8, 1, 2, 3, 9, 4, 10, 5, 6, 7]

Eci*_*ana 8

Insert the values in reverse order. Like this:

original_list = [0, 1, 2, 3, 4, 5, 6, 7]
insertion_indices = [1, 4, 5]
new_numbers = [8, 9, 10]

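# sorted() works in both Python 2 and 3; in Python 3, zip() returns an
# iterator, which has no .sort() method
new = sorted(zip(insertion_indices, new_numbers), reverse=True)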

for i, x in new:
    original_list.insert(i, x)

The reason this works is based on the following observation:

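Inserting a value at the beginning of a list shifts the indices of all other values by 1. Inserting a value at the end, however, leaves the existing indices unchanged. More generally, an insertion only shifts the elements at or after the insertion point. So if you first insert the value with the largest index (10, at index 5) and continue "backwards", you never have to update any indices.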
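For comparison, the alternative raised in the question (bumping each index as values are added) can be sketched as follows. This is only an illustration, not part of the answer above; the helper name `insert_with_offset` is made up for the example, and it works on a copy rather than in place.

```python
def insert_with_offset(values, indices, new_values):
    """Insert new_values at positions given relative to the original list."""
    result = list(values)  # copy, so the caller's list is left untouched
    # Process pairs in ascending index order (stable for equal indices).
    pairs = sorted(zip(indices, new_values), key=lambda pair: pair[0])
    for offset, (index, value) in enumerate(pairs):
        # Each earlier insertion has shifted this target position by one.
        result.insert(index + offset, value)
    return result


original_list = [0, 1, 2, 3, 4, 5, 6, 7]
print(insert_with_offset(original_list, [1, 4, 5], [8, 9, 10]))
# prints [0, 8, 1, 2, 3, 9, 4, 10, 5, 6, 7]
```

Both variants give the same result here; inserting in reverse order simply avoids the offset bookkeeping.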


Div*_*kar 7

Since NumPy is tagged and the input is described as a list/array, you can simply use the built-in numpy.insert -

np.insert(original_list, insertion_indices, new_numbers)

To implement the same idea as a custom function (mainly for performance), we can use a mask, like so -

import numpy as np

def insert_numbers(original_list, insertion_indices, new_numbers):
    # Length of output array
    n = len(original_list) + len(insertion_indices)

    # Set up a mask array to select between new and old numbers
    # (insertion_indices is expected to be sorted in ascending order)
    mask = np.ones(n, dtype=bool)
    mask[np.asarray(insertion_indices) + np.arange(len(insertion_indices))] = False

    # Setup output array for assigning values from old and new lists/arrays
    # by using mask and inverted mask version
    out = np.empty(n,dtype=int)
    out[mask] = original_list
    out[~mask] = new_numbers
    return out

For list output, append .tolist().
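To make the mask step more concrete, here is a small sketch of what the mask looks like for the data from the question. The variable names `n_old` and `new_positions` are introduced only for this illustration.

```python
import numpy as np

insertion_indices = np.array([1, 4, 5])
n_old = 8  # length of original_list in the question

# Each new value ends up at its requested index plus the number of
# insertions that precede it (this relies on ascending indices).
new_positions = insertion_indices + np.arange(len(insertion_indices))

# True marks slots filled from the old values, False marks new values.
mask = np.ones(n_old + len(insertion_indices), dtype=bool)
mask[new_positions] = False

print(new_positions.tolist())     # [1, 5, 7]
print(mask.astype(int).tolist())  # [1, 0, 1, 1, 1, 0, 1, 0, 1, 1, 1]
```

The zeros sit at positions 1, 5 and 7, which is exactly where 8, 9 and 10 appear in the desired output.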

Sample run -

In [83]: original_list = [0, 1, 2, 3, 4, 5, 6, 7]
    ...: insertion_indices = [1, 4, 5]
    ...: new_numbers = [8, 9, 10]
    ...: 

In [85]: np.insert(original_list, insertion_indices, new_numbers)
Out[85]: array([ 0,  8,  1,  2,  3,  9,  4, 10,  5,  6,  7])

In [86]: np.insert(original_list, insertion_indices, new_numbers).tolist()
Out[86]: [0, 8, 1, 2, 3, 9, 4, 10, 5, 6, 7]

Runtime test on a dataset scaled up 10000x -

In [184]: original_list = list(range(70000))
     ...: insertion_indices = np.sort(np.random.choice(len(original_list), 30000, replace=0)).tolist()
     ...: new_numbers = np.random.randint(0,10, len(insertion_indices)).tolist()
     ...: out1 = np.insert(original_list, insertion_indices, new_numbers)
     ...: out2 = insert_numbers(original_list, insertion_indices, new_numbers)
     ...: print(np.allclose(out1, out2))
True

In [185]: %timeit np.insert(original_list, insertion_indices, new_numbers)
100 loops, best of 3: 5.37 ms per loop

In [186]: %timeit insert_numbers(original_list, insertion_indices, new_numbers)
100 loops, best of 3: 4.8 ms per loop

Let's test with arrays as inputs -

In [190]: original_list = np.arange(70000)
     ...: insertion_indices = np.sort(np.random.choice(len(original_list), 30000, replace=0))
     ...: new_numbers = np.random.randint(0,10, len(insertion_indices))
     ...: out1 = np.insert(original_list, insertion_indices, new_numbers)
     ...: out2 = insert_numbers(original_list, insertion_indices, new_numbers)
     ...: print(np.allclose(out1, out2))
True

In [191]: %timeit np.insert(original_list, insertion_indices, new_numbers)
1000 loops, best of 3: 1.48 ms per loop

In [192]: %timeit insert_numbers(original_list, insertion_indices, new_numbers)
1000 loops, best of 3: 1.07 ms per loop

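Performance improves sharply with array inputs (from roughly 5 ms to 1-1.5 ms in the runs above), because there is no runtime overhead from converting lists to arrays.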
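For anyone who wants to repeat the comparison outside IPython, a rough harness using the standard timeit module might look like this. It is a sketch under a few assumptions of mine: the seed, the repetition count and the parameter names are arbitrary, and the measured times will depend on the machine and NumPy version.

```python
import timeit

import numpy as np


def insert_numbers(original, indices, new_values):
    # Mask-based insertion as in the answer; indices sorted ascending.
    indices = np.asarray(indices)
    n = len(original) + len(indices)
    mask = np.ones(n, dtype=bool)
    mask[indices + np.arange(len(indices))] = False
    out = np.empty(n, dtype=int)
    out[mask] = original
    out[~mask] = new_values
    return out


rng = np.random.default_rng(0)  # arbitrary seed, for repeatability only
original = np.arange(70000)
indices = np.sort(rng.choice(len(original), 30000, replace=False))
new_values = rng.integers(0, 10, len(indices))

# Sanity check: both approaches should produce identical arrays.
out1 = np.insert(original, indices, new_values)
out2 = insert_numbers(original, indices, new_values)
print(np.array_equal(out1, out2))

runs = 50  # arbitrary repetition count
for name, func in [("np.insert", np.insert), ("insert_numbers", insert_numbers)]:
    seconds = timeit.timeit(lambda: func(original, indices, new_values), number=runs)
    print(f"{name}: {seconds / runs * 1e3:.2f} ms per call")
```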