How do I iterate over this n-dimensional dataset?

poo*_*kie 7 python iteration multidimensional-array

I have a dataset with 4 dimensions (for now...), and I need to iterate over it.

To access a value in the dataset, I do this:

value = dataset[i,j,k,l]

Now, I can get the shape of the dataset:

shape = [4,5,2,6]

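The values in shape are the lengths of the dimensions.

Given the number of dimensions, how can I iterate over all the elements in the dataset? Here is an example: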

for i in range(shape[0]):
    for j in range(shape[1]):
        for k in range(shape[2]):
            for l in range(shape[3]):
                print('BOOM')
                value = dataset[i,j,k,l]

In the future, shape may change. For example, shape might have 10 elements instead of the current 4.

Is there a nice, clean way to do this with Python 3?

MSe*_*ert 7

You can use itertools.product to iterate over the Cartesian product¹ of some values (in this case, the indices):

import itertools
shape = [4,5,2,6]
for idx in itertools.product(*[range(s) for s in shape]):
    value = dataset[idx]
    print(idx, value)
    # i would be "idx[0]", j "idx[1]" and so on...

However, if what you want to iterate over is a NumPy array, it may be easier to use np.ndenumerate:

import numpy as np

arr = np.random.random([4,5,2,6])
for idx, value in np.ndenumerate(arr):
    print(idx, value)
    # i would be "idx[0]", j "idx[1]" and so on...
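If only the indices are needed, or the shape is known before the data is loaded, np.ndindex may also be worth a look, since it takes the shape directly. The following is a minimal sketch, assuming dataset is a NumPy array; the array contents are made up purely for illustration.

```python
import numpy as np

shape = (4, 5, 2, 6)
# Illustrative data only: the integers 0..239 arranged in the given shape.
dataset = np.arange(np.prod(shape)).reshape(shape)

visited = 0
for idx in np.ndindex(*shape):
    value = dataset[idx]  # idx is a tuple of ints, one per dimension
    visited += 1

print(visited)  # one iteration per element of the array
```

Like itertools.product, this walks the last dimension fastest, so the order matches the nested loops in the question.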

¹ You asked for clarification of what itertools.product(*[range(s) for s in shape]) actually does, so I'll explain it in more detail.

For example, suppose you have this loop:

for i in range(10):
    for j in range(8):
        pass  # do whatever

This can also be written using product as:

for i, j in itertools.product(range(10), range(8)):
#                                        ^^^^^^^^---- the inner for loop
#                             ^^^^^^^^^-------------- the outer for loop
    pass  # do whatever

This means product is just a convenient way to reduce the number of independent for loops.
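A quick way to convince yourself of this is to collect the index pairs both ways and compare them. This is only a sketch; the names nested and flat are introduced here for illustration.

```python
import itertools

# Collect the index pairs produced by two nested loops...
nested = []
for i in range(10):
    for j in range(8):
        nested.append((i, j))

# ...and the pairs produced by a single loop over the Cartesian product.
flat = list(itertools.product(range(10), range(8)))

print(nested == flat)  # True: same pairs, in the same order
```

The rightmost iterable passed to product advances fastest, which is exactly how the innermost for loop behaves.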

If you want to convert a variable number of for loops into a product, you basically need two steps:

# Create the "values" each for-loop iterates over
loopover = [range(s) for s in shape]

# Unpack the list using "*" operator because "product" needs them as 
# different positional arguments:
prod = itertools.product(*loopover)

for idx in prod:
    # idx is a tuple; if you know the number of values it can be unpacked:
    #     i_0, i_1, ..., i_n = idx
    # (the "..." has to be replaced with the variables in real code!)
    pass  # do whatever

This is equivalent to:

for i_0 in range(shape[0]):
    for i_1 in range(shape[1]):
        ... # more loops
            for i_n in range(shape[n]):  # n is len(shape) - 1, the last valid index
                # do whatever
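The two steps can be wrapped in a small helper so the calling code does not change when shape grows. This is a sketch, not a definitive implementation: iter_indices and get_item are hypothetical names, and the nested-list dataset is made up to show the idea without NumPy (a plain nested list needs one subscript per dimension rather than a tuple index).

```python
import itertools

def iter_indices(shape):
    """Return an iterator over every index tuple for the given shape."""
    return itertools.product(*(range(s) for s in shape))

def get_item(nested, idx):
    """Index a nested list one dimension at a time (hypothetical helper)."""
    for i in idx:
        nested = nested[i]
    return nested

# Illustrative 2 x 3 nested list; deeper nesting works the same way.
dataset = [[1, 2, 3], [4, 5, 6]]
shape = (2, 3)

values = [get_item(dataset, idx) for idx in iter_indices(shape)]
print(values)  # [1, 2, 3, 4, 5, 6]
```

For a NumPy array, get_item is unnecessary because dataset[idx] accepts the tuple directly.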