Col*_*lin 209 python optimization
我需要编写random.choice的加权版本(列表中的每个元素都有不同的被选中概率).这就是我想出的:
def weightedChoice(choices):
"""Like random.choice, but each element can have a different chance of
being selected.
choices can be any iterable containing iterables with two items each.
Technically, they can have more than two items, the rest will just be
ignored. The first item is the thing being chosen, the second item is
its weight. The weights can be any numeric values, what matters is the
relative differences between them.
"""
space = {}
current = 0
for choice, weight in choices:
if weight > 0:
space[current] = choice
current += weight
rand = random.uniform(0, current)
for key in sorted(space.keys() + [current]):
if rand < key:
return choice
choice = space[key]
return None
Run Code Online (Sandbox Code Playgroud)
这个功能对我来说似乎过于复杂,也很难看.我希望这里的每个人都可以提出改进建议或其他方法.效率对我来说并不像代码清洁度和可读性那么重要.
Ron*_*xão 265
从版本1.7.0开始,NumPy具有choice支持概率分布的功能.
from numpy.random import choice
draw = choice(list_of_candidates, number_of_items_to_pick,
p=probability_distribution)
Run Code Online (Sandbox Code Playgroud)
注意,这probability_distribution是一个顺序相同的顺序list_of_candidates.您还可以使用关键字replace=False更改行为,以便不替换绘制的项目.
vis*_*ell 160
从Python3.6开始choices,random模块中有一个方法.
Python 3.6.1 (v3.6.1:69c0db5050, Mar 21 2017, 01:21:04)
Type 'copyright', 'credits' or 'license' for more information
IPython 6.0.0 -- An enhanced Interactive Python. Type '?' for help.
In [1]: import random
In [2]: random.choices(
...: population=[['a','b'], ['b','a'], ['c','b']],
...: weights=[0.2, 0.2, 0.6],
...: k=10
...: )
Out[2]:
[['c', 'b'],
['c', 'b'],
['b', 'a'],
['c', 'b'],
['c', 'b'],
['b', 'a'],
['c', 'b'],
['b', 'a'],
['c', 'b'],
['c', 'b']]
Run Code Online (Sandbox Code Playgroud)
人们还提到有random.choices哪些支持权重,但它不支持2d数组,依此类推.
所以,如果你有3.6.x Python,基本上你可以得到你喜欢的任何东西(见更新).k
更新:正如@roganjosh所提到的那样,numpy.choice无法在没有替换的情况下返回值,正如文档中提到的那样:
返回
replace从具有替换的人口中选择的大小的元素列表.
而@ ronan-paixão的精彩回答表明,choices有random控制这种行为的论点.
Ned*_*der 131
def weighted_choice(choices):
total = sum(w for c, w in choices)
r = random.uniform(0, total)
upto = 0
for c, w in choices:
if upto + w >= r:
return c
upto += w
assert False, "Shouldn't get here"
Run Code Online (Sandbox Code Playgroud)
Ray*_*ger 68
0.0 <= x < total.from random import random
from bisect import bisect
def weighted_choice(choices):
values, weights = zip(*choices)
total = 0
cum_weights = []
for w in weights:
total += w
cum_weights.append(total)
x = random() * total
i = bisect(cum_weights, x)
return values[i]
>>> weighted_choice([("WHITE",90), ("RED",8), ("GREEN",2)])
'WHITE'
Run Code Online (Sandbox Code Playgroud)
如果您需要进行多项选择,请将其拆分为两个函数,一个用于构建累积权重,另一个用于分割为随机点.
小智 19
如果你不介意使用numpy,你可以使用numpy.random.choice.
例如:
import numpy
items = [["item1", 0.2], ["item2", 0.3], ["item3", 0.45], ["item4", 0.05]
elems = [i[0] for i in items]
probs = [i[1] for i in items]
trials = 1000
results = [0] * len(items)
for i in range(trials):
res = numpy.random.choice(items, p=probs) #This is where the item is selected!
results[items.index(res)] += 1
results = [r / float(trials) for r in results]
print "item\texpected\tactual"
for i in range(len(probs)):
print "%s\t%0.4f\t%0.4f" % (items[i], probs[i], results[i])
Run Code Online (Sandbox Code Playgroud)
如果你知道需要提前做出多少选择,你可以不用这样的循环来做:
numpy.random.choice(items, trials, p=probs)
Run Code Online (Sandbox Code Playgroud)
Pau*_*McG 15
原油,但可能就足够了:
import random
weighted_choice = lambda s : random.choice(sum(([v]*wt for v,wt in s),[]))
Run Code Online (Sandbox Code Playgroud)
它有用吗?
# define choices and relative weights
choices = [("WHITE",90), ("RED",8), ("GREEN",2)]
# initialize tally dict
tally = dict.fromkeys(choices, 0)
# tally up 1000 weighted choices
for i in xrange(1000):
tally[weighted_choice(choices)] += 1
print tally.items()
Run Code Online (Sandbox Code Playgroud)
打印:
[('WHITE', 904), ('GREEN', 22), ('RED', 74)]
Run Code Online (Sandbox Code Playgroud)
假设所有权重都是整数.它们不需要加100,我只是这样做,以使测试结果更容易解释.(如果权重是浮点数,则将它们全部乘以10,直到所有权重> = 1.)
weights = [.6, .2, .001, .199]
while any(w < 1.0 for w in weights):
weights = [w*10 for w in weights]
weights = map(int, weights)
Run Code Online (Sandbox Code Playgroud)
Max*_*ime 14
如果你有加权字典而不是列表,你可以写这个
items = { "a": 10, "b": 5, "c": 1 }
random.choice([k for k in items for dummy in range(items[k])])
Run Code Online (Sandbox Code Playgroud)
请注意,[k for k in items for dummy in range(items[k])]生成此列表['a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'c', 'b', 'b', 'b', 'b', 'b']
Nic*_*eli 12
从Python开始v3.6,random.choices可以使用list可选权重从给定的总体中返回指定大小的元素.
random.choices(population, weights=None, *, cum_weights=None, k=1)
人口:list包含独特的观察.(如果是空的,加注IndexError)
权重:更精确地说,选择所需的相对权重.
cum_weights:进行选择所需的累积权重.
k:要输出的size(len)list.(默认len()=1)
几点注意事项:
1)它使用带有替换的加权采样,以便稍后替换所绘制的项目.权重序列中的值本身并不重要,但它们的相对比例确实如此.
与np.random.choice此不同的是,只能将概率作为权重,并且必须确保个别概率的总和达到1个标准,这里没有这样的规定.只要它们属于数字类型(类型int/float/fraction除外Decimal),它们仍然会执行.
>>> import random
# weights being integers
>>> random.choices(["white", "green", "red"], [12, 12, 4], k=10)
['green', 'red', 'green', 'white', 'white', 'white', 'green', 'white', 'red', 'white']
# weights being floats
>>> random.choices(["white", "green", "red"], [.12, .12, .04], k=10)
['white', 'white', 'green', 'green', 'red', 'red', 'white', 'green', 'white', 'green']
# weights being fractions
>>> random.choices(["white", "green", "red"], [12/100, 12/100, 4/100], k=10)
['green', 'green', 'white', 'red', 'green', 'red', 'white', 'green', 'green', 'green']
Run Code Online (Sandbox Code Playgroud)
2)如果既未指定权重也未指定cum_weights,则以相等的概率进行选择.如果提供了权重序列,则它必须与总体序列的长度相同.
指定权重和cum_weights会引发一个TypeError.
>>> random.choices(["white", "green", "red"], k=10)
['white', 'white', 'green', 'red', 'red', 'red', 'white', 'white', 'white', 'green']
Run Code Online (Sandbox Code Playgroud)
3)cum_weights通常是itertools.accumulate函数的结果,在这种情况下非常方便.
从链接的文档:
在内部,相对权重在进行选择之前会转换为累积权重,因此提供累积权重可以节省工作量.
因此,供应weights=[12, 12, 4]或cum_weights=[12, 24, 28]为我们的设计案例产生相同的结果,而后者似乎更快/更有效.
Ray*_*ger 11
这是包含在Python 3.6标准库中的版本:
import itertools as _itertools
import bisect as _bisect
class Random36(random.Random):
"Show the code included in the Python 3.6 version of the Random class"
def choices(self, population, weights=None, *, cum_weights=None, k=1):
"""Return a k sized list of population elements chosen with replacement.
If the relative weights or cumulative weights are not specified,
the selections are made with equal probability.
"""
random = self.random
if cum_weights is None:
if weights is None:
_int = int
total = len(population)
return [population[_int(random() * total)] for i in range(k)]
cum_weights = list(_itertools.accumulate(weights))
elif weights is not None:
raise TypeError('Cannot specify both weights and cumulative weights')
if len(cum_weights) != len(population):
raise ValueError('The number of weights does not match the population')
bisect = _bisect.bisect
total = cum_weights[-1]
return [population[bisect(cum_weights, random() * total)] for i in range(k)]
Run Code Online (Sandbox Code Playgroud)
资料来源:https: //hg.python.org/cpython/file/tip/Lib/random.py#l340
小智 6
加权选择的一种非常基本且简单的方法如下:
np.random.choice(['A', 'B', 'C'], p=[0.3, 0.4, 0.3])
Run Code Online (Sandbox Code Playgroud)
import numpy as np
w=np.array([ 0.4, 0.8, 1.6, 0.8, 0.4])
np.random.choice(w, p=w/sum(w))
Run Code Online (Sandbox Code Playgroud)