Randomly select 50 items from a list to write to a file

O.r*_*rka 116 python random select file list

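So far I have figured out how to import the file, create the new files, and randomize the list.

What I can't work out is how to randomly select 50 items from the list to write to a file.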

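import os
import random

def randomizer(input, output1='random_1.txt', output2='random_2.txt', output3='random_3.txt', output4='random_total.txt'):
    # Read the input file and split it into whitespace-separated items
    with open(input, 'r') as in_file:
        query = in_file.read().split()
    dir, file = os.path.split(input)

    # The output files are created next to the input file
    temp1 = os.path.join(dir, output1)
    temp2 = os.path.join(dir, output2)
    temp3 = os.path.join(dir, output3)
    temp4 = os.path.join(dir, output4)

    # Shuffle the items in place, then write all of them, one per line
    random.shuffle(query)

    with open(temp4, 'w') as out_file4:
        for item in query:
            out_file4.write(item + '\n')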

So, if the fully randomized file is, for example:

random_total = ['9','2','3','1','5','6','8','7','0','4']

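I want 3 files (out_file1 | 2 | 3): the first with a random set of 3, the second with a random set of 3, and the third with a random set of 3 (for this example; the files I actually want to create should each have 50).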

random_1 = ['9','2','3']
random_2 = ['1','5','6']
random_3 = ['8','7','0']

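So the final '4' would not be included, which is fine.

How do I select 50 from the randomized list?

Better yet, how do I randomly select 50 from the original list?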

Joh*_*ooy 231

If the list is already in random order, you can just take the first 50.

Otherwise, use

import random
random.sample(the_list, 50)

random.sample help text:

sample(self, population, k) method of random.Random instance
    Chooses k unique random elements from a population sequence.

    Returns a new list containing elements from the population while
    leaving the original population unchanged.  The resulting list is
    in selection order so that all sub-slices will also be valid random
    samples.  This allows raffle winners (the sample) to be partitioned
    into grand prize and second place winners (the subslices).

    Members of the population need not be hashable or unique.  If the
    population contains repeats, then each occurrence is a possible
    selection in the sample.

    To choose a sample in a range of integers, use xrange as an argument.
    This is especially fast and space efficient for sampling from a
    large population:   sample(xrange(10000000), 60)
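Applied to the question, the sub-slice property described in the help text means that one sample can be cut into several disjoint groups. The following is a minimal sketch, assuming the items are strings and that the list holds at least `groups * size` of them. The helper names `sample_groups` and `write_groups` are made up for this illustration and are not part of the original answer.

```python
import os
import random

def sample_groups(items, groups=3, size=50):
    """Return `groups` disjoint lists of `size` randomly chosen items."""
    # One draw without replacement; raises ValueError if items is too short.
    picked = random.sample(items, groups * size)
    # Consecutive slices of a sample are themselves valid random samples.
    return [picked[i * size:(i + 1) * size] for i in range(groups)]

def write_groups(groups, directory="."):
    """Write each group to random_1.txt, random_2.txt, ... one item per line."""
    paths = []
    for number, group in enumerate(groups, start=1):
        path = os.path.join(directory, "random_%d.txt" % number)
        with open(path, "w") as out_file:
            out_file.write("\n".join(group) + "\n")
        paths.append(path)
    return paths
```

Calling `write_groups(sample_groups(query))` would then produce three files of 50 items each, with no item shared between files unless it occurs more than once in `query`.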

  • Have it sample from a list of indices (range(len(list))), then rebuild the sample from the random indices and the original list. (2 upvotes)
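The index-based approach from the comment above can be sketched as follows. It is useful when the positions of the chosen items are needed as well, for example to work out which items were left unselected. The function name `sample_with_indices` is hypothetical.

```python
import random

def sample_with_indices(items, k):
    """Pick k distinct positions, then look up the items at those positions."""
    indices = random.sample(range(len(items)), k)
    chosen = [items[i] for i in indices]
    return indices, chosen

data = ['9', '2', '3', '1', '5', '6', '8', '7', '0', '4']
indices, chosen = sample_with_indices(data, 3)

# Everything whose position was not drawn
picked = set(indices)
leftover = [item for i, item in enumerate(data) if i not in picked]
```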

Man*_*ani 38

A simple way to select random items is to shuffle and then slice.

import random
a = [1, 2, 3, 4, 5, 6, 7, 8, 9]
random.shuffle(a)  # shuffles the list in place
print(a[:4])  # prints 4 random items

  • I use this to easily create test/train sets for machine learning projects. Using `random.choice(mylist,3)` does not create two disjoint sets. (6 upvotes)
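The test/train use mentioned in the comment above might look like the following sketch. It shuffles a copy, so the caller's list keeps its order, whereas calling `random.shuffle` directly on the original reorders it. The name `split_disjoint` and the cut at 4 items are assumptions for illustration.

```python
import random

def split_disjoint(items, first_size):
    """Shuffle a copy of items and cut it into two non-overlapping parts."""
    shuffled = list(items)      # copy, so the input list is not reordered
    random.shuffle(shuffled)
    return shuffled[:first_size], shuffled[first_size:]

a = [1, 2, 3, 4, 5, 6, 7, 8, 9]
test_set, train_set = split_disjoint(a, 4)
print(test_set, train_set)
```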

Moe*_* MH 27

I think NumPy's random.choice() is a better option.

import numpy as np

mylist = [13,23,14,52,6,23]

np.random.choice(mylist, 3, replace=False)

The function returns an array of 3 randomly selected values from the list.

  • This can repeat list items (8 upvotes)
  • I think you need to use `random.choice(mylist,3,replace = False)`. Using `import numpy as np` and `np.random.choice(mylist,3,replace = False)` is also less confusing (7 upvotes)
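The duplicate-item concern raised in these comments comes down to sampling with versus without replacement. The standard library makes the same distinction, shown here as a sketch: `random.choices` samples with replacement, while `random.sample` does not. This uses the stdlib `random` module rather than NumPy, so it illustrates the distinction and is not a drop-in replacement for the answer's code.

```python
import random

mylist = [13, 23, 14, 52, 6, 23]

# With replacement: the same position can be drawn more than once.
with_replacement = random.choices(mylist, k=3)

# Without replacement: each position is drawn at most once. Because 23
# occurs twice in mylist, two 23s can still appear, but only as the two
# separate occurrences.
without_replacement = random.sample(mylist, 3)

print(with_replacement, without_replacement)
```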