Sampling without replacement - by mutating the list

Edm*_*mon 1 python algorithm sampling

I'm looking for an efficient function in Python that samples without replacement by actually mutating the original list. That is, an alternative to:

random.sample(population, k)

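that removes the elements from the original list as they are selected. The list can hold millions of items, and there may be dozens of subsequent calls to the sampling function.

Ideally, I would like to do something like this: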

sample_size_1 = 5
sample_size_2 = 200
sample_size_3 = 100
population = list(range(10000000))  # a list, so it can be mutated in place

sample_1 = select_sample(population, sample_size_1)  # population is shrunk
sample_2 = select_sample(population, sample_size_2)  # population is shrunk again
sample_3 = select_sample(population, sample_size_3)  # and population is shrunk again

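where population is effectively shrunk between calls to select_sample.

I have some code that I can show here, but I'm hoping for something that is already available, or something more "pythonic" than my while loop.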

Ffi*_*ydd 5

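A simple approach is to shuffle the population so that the initial ordering is random (if it isn't already). Then take elements from the end and remove them.

You can get the elements by slicing with population[-sample_size:] and remove them with population[-sample_size:] = [].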

import random

population = list(range(100))

# Shuffle population so the ordering is random.
random.shuffle(population)

for sample_size in [1, 5, 10]:
    sample = population[-sample_size:]
    population[-sample_size:] = []
    print(sample)
    # [79]
    # [66, 89, 81, 0, 38]
    # [18, 39, 90, 36, 11, 32, 63, 65, 72, 67]

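You can also use population.pop() if you only want to remove one element at a time (for example, if sample_size is 1).

A function that does this would be (assuming your population has already been shuffled):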
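
As a minimal sketch of the one-at-a-time case, assuming the list has already been shuffled: list.pop() removes and returns the last element, so each call yields one random draw and shrinks the list by one. The helper name draw_one is hypothetical, introduced here only for illustration.

```python
import random


def draw_one(pop):
    """Remove and return the last element of an already-shuffled list."""
    # draw_one is a hypothetical name, not part of the random module.
    return pop.pop()


population = list(range(10))
random.shuffle(population)  # randomize the order once, up front

first = draw_one(population)   # population now has 9 items
second = draw_one(population)  # population now has 8 items
```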

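def select_sample(pop, size):
    # Assumes pop is already shuffled; removes and returns its last `size` items.
    if size <= 0:
        return []  # pop[-0:] would otherwise select the whole list
    x = pop[-size:]
    pop[-size:] = []
    return x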
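
Tying this back to the question, the following is a rough sketch of the whole workflow: shuffle once, then call select_sample repeatedly. The guard for a size of 0 and the smaller population (100,000 instead of 10,000,000, to keep the example quick) are assumptions made for this illustration.

```python
import random


def select_sample(pop, size):
    """Remove and return the last `size` items of an already-shuffled list."""
    if size <= 0:
        return []  # assumption: guard added because pop[-0:] is the whole list
    sample = pop[-size:]
    pop[-size:] = []  # delete the sampled tail in place
    return sample


population = list(range(100000))  # smaller than in the question, for speed
random.shuffle(population)        # one O(n) shuffle up front

sample_1 = select_sample(population, 5)    # population is shrunk
sample_2 = select_sample(population, 200)  # population is shrunk again
sample_3 = select_sample(population, 100)  # and population is shrunk again
```

After the single shuffle, each call only touches the tail of the list, so the cost per call depends on the sample size rather than on the size of the remaining population.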