Coding pattern for random percentage branching?

Question

Mof*_*ast 51 java random design-patterns

So, let's say we have one code block that we want to execute 70% of the time and another one that we want to execute 30% of the time.

if(Math.random() < 0.7)
    70percentmethod();
else
    30percentmethod();

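Simple enough. But what if we want it to be easily expandable to, say, 30%/60%/10% and so on? Here that would require adding and changing all the if statements on every change, which is not great to use, slow, and error-inducing.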

So far I've found large switches to be decently useful for this use case, for example:

// rand(0, 10) is taken to return an integer from 0 to 9
switch(rand(0, 10)){
    case 0:
    case 1:
    case 2:
    case 3:
    case 4:
    case 5:
    case 6:70percentmethod();break;
    case 7:
    case 8:
    case 9:30percentmethod();break;
}

Which can be very easily changed to:

// rand(0, 10) is taken to return an integer from 0 to 9
switch(rand(0, 10)){
    case 0:10percentmethod();break;
    case 1:
    case 2:
    case 3:
    case 4:
    case 5:
    case 6:60percentmethod();break;
    case 7:
    case 8:
    case 9:30percentmethod();break;
}

But these have their drawbacks as well: they are cumbersome and split into a predetermined number of divisions.

Something ideal would, I guess, be based on a "frequency number" system, like so:

(1,a),(1,b),(2,c) -> 25% a, 25% b, 50% c

Then, if you added another one:

(1,a),(1,b),(2,c),(6,d) -> 10% a, 10% b, 20% c, 60% d

So you simply add up the numbers, treat the sum as 100%, and then split that proportionally.

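I suppose it wouldn't be that much trouble to make a handler for it with a customized hashmap or something, but I'm wondering whether there is some established way/pattern or lambda for it before I go all spaghetti on this.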
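
The arithmetic behind the "frequency number" idea can be sketched as follows. This is only an illustration of the normalization step; the class name `FrequencyShares` and the method `toPercentages` are hypothetical and do not come from the question.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not from the question): turns frequency numbers into percentages.
final class FrequencyShares {
    static Map<String, Double> toPercentages(Map<String, Integer> frequencies) {
        int total = 0;
        for (int f : frequencies.values()) {
            total += f; // the sum of all frequency numbers stands for 100%
        }
        if (total <= 0) {
            throw new IllegalArgumentException("frequencies must sum to a positive number");
        }
        Map<String, Double> shares = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : frequencies.entrySet()) {
            shares.put(e.getKey(), 100.0 * e.getValue() / total);
        }
        return shares;
    }
}
```

With the frequencies (1,a),(1,b),(2,c),(6,d) the total is 10, so the shares come out as 10, 10, 20 and 60 percent, matching the example above.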

Answer 1

dan*_*niu 28

EDIT: See the edit at the end for a more elegant solution. I'll leave this in, though.

You can use a NavigableMap to store these methods mapped to their percentage.

import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// keys are cumulative probabilities: 0.3 covers [0, 0.3), 1.0 covers [0.3, 1.0)
NavigableMap<Double, Runnable> runnables = new TreeMap<>();

runnables.put(0.3, this::30PercentMethod);
runnables.put(1.0, this::70PercentMethod);

// the map must iterate in ascending key order (as a TreeMap does)
public static void runRandomly(Map<Double, Runnable> runnables) {
    double percentage = Math.random();
    for (Map.Entry<Double, Runnable> entry : runnables.entrySet()) {
        if (percentage < entry.getKey()) {
            entry.getValue().run();
            return; // make sure you only call one method
        }
    }
    throw new RuntimeException("map not filled properly for " + percentage);
}

// or, because I'm still practicing streams by using them for everything
public static void runRandomly(Map<Double, Runnable> runnables) {
    double percentage = Math.random();
    runnables.entrySet().stream()
        .filter(e -> percentage < e.getKey())
        .findFirst().orElseThrow(() -> 
                new RuntimeException("map not filled properly for " + percentage))
        .getValue()
        .run();
}

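The NavigableMap is sorted by key (a HashMap, for example, gives no guarantees about the order of its entries), so you get the entries ordered by their percentages. This is relevant because if you have two items (3,r1),(7,r2), they result in the following entries: r1 = 0.3 and r2 = 1.0, and they need to be evaluated in this order (e.g. if they were evaluated in the reverse order, the result would always be r2).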
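
As a side note, a sorted map can also do the lookup itself instead of being iterated. The following is a minimal sketch and not part of the answer: the class name `CumulativePicker` and its methods are hypothetical, and it assumes the keys are cumulative weights as described above. It relies on `NavigableMap.higherEntry`, which returns the entry with the smallest key strictly greater than the argument.

```java
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical sketch: keys are cumulative weights, values are the actions to run.
final class CumulativePicker {
    private final NavigableMap<Double, Runnable> cumulative = new TreeMap<>();
    private double total = 0.0;

    void add(double weight, Runnable action) {
        if (weight <= 0) {
            throw new IllegalArgumentException("weight must be positive");
        }
        total += weight;
        cumulative.put(total, action); // key = running sum of all weights so far
    }

    // Returns the action whose range contains roll; roll must satisfy 0 <= roll < total.
    Runnable pick(double roll) {
        return cumulative.higherEntry(roll).getValue();
    }

    void runRandomly() {
        if (cumulative.isEmpty()) {
            throw new IllegalStateException("nothing to run");
        }
        pick(ThreadLocalRandom.current().nextDouble(total)).run();
    }
}
```

Because the keys are running sums, the weights do not need to be normalized, and two actions with the same weight never collide on a key.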

As for the splitting, it should go something like this. With a tuple class like this

static class Pair<X, Y>
{
    public Pair(X f, Y s)
    {
        first = f;
        second = s;
    }

    public final X first;
    public final Y second;
}

you can create a map like this

// the parameter contains the (1,m1), (1,m2), (3,m3) pairs
private static Map<Double,Runnable> splitToPercentageMap(Collection<Pair<Integer,Runnable>> runnables)
{

    // this adds all Runnables to lists of same int value,
    // overall those lists are sorted by that int (so least probable first)
    double total = 0;
    Map<Integer,List<Runnable>> byNumber = new TreeMap<>();
    for (Pair<Integer,Runnable> e : runnables)
    {
        total += e.first;
        List<Runnable> list = byNumber.getOrDefault(e.first, new ArrayList<>());
        list.add(e.second);
        byNumber.put(e.first, list);
    }

    Map<Double,Runnable> targetList = new TreeMap<>();
    double current = 0;
    for (Map.Entry<Integer,List<Runnable>> e : byNumber.entrySet())
    {
        for (Runnable r : e.getValue())
        {
            double percentage = (double) e.getKey() / total;
            current += percentage;
            targetList.put(current, r);
        }
    }

    return targetList;
}

And all of this wrapped up in a class

class RandomRunner {
    private List<Pair<Integer, Runnable>> runnables = new ArrayList<>();
    public void add(int value, Runnable toRun) {
        runnables.add(new Pair<>(value, toRun));
    }
    public void remove(Runnable toRemove) {
        for (Iterator<Pair<Integer, Runnable>> r = runnables.iterator();
            r.hasNext(); ) {
            if (toRemove == r.next().second) {
               r.remove();
               break;
            }
        }
    }
    public void runRandomly() {
        // split list, use code from above
    }
}

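Edit:
Actually, the above is what you get when an idea gets stuck in your head and you don't question it properly. Keeping the RandomRunner class interface, this is much easier: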

import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

class RandomRunner {
    List<Runnable> runnables = new ArrayList<>();
    public void add(int value, Runnable toRun) {
        // add the methods as often as their weight indicates.
        // this should be fine for smaller numbers;
        // if you get lists with millions of entries, optimize
        for (int i = 0; i < value; i++) {
            runnables.add(toRun);
        }
    }
    public void remove(Runnable r) {
        // removes every copy of r, i.e. its whole weight
        Iterator<Runnable> myRunnables = runnables.iterator();
        while (myRunnables.hasNext()) {
            if (myRunnables.next() == r) {
                myRunnables.remove();
            }
        }
    }
    public void runRandomly() {
        if (runnables.isEmpty()) return;
        // roll n-sided die
        int runIndex = ThreadLocalRandom.current().nextInt(0, runnables.size());
        runnables.get(runIndex).run();
    }
}

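30PercentMethod and 70PercentMethod are not valid Java method names (7 upvotes)
@Michael You're right. I just reused the names the OP gave in his question. (7 upvotes)
I'm surprised this answer is so popular (no offense). If you have more methods to add to the map, the likelihood of each one occurring isn't actually obvious: you need to subtract from each value the one before it. It also doesn't allow you to have two methods weighted the same. (4 upvotes)
It does allow multiple equally weighted outcomes, because the keys are *cumulative* probabilities. So if you want outcomes A, B, C with probabilities 0.25, 0.25, 0.5, you would have (0.25, A), (0.5, B) and (1.0, C). (2 upvotes)
@Michael I'm a bit surprised myself. I did add a simpler solution to the answer, which should address your concerns. (2 upvotes)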

Answer 2

jpa*_*jpa 25

All these answers seem quite complicated, so I'll just post the keep-it-simple alternative:

double rnd = Math.random();
if((rnd -= 0.6) < 0)
    60percentmethod();
else if ((rnd -= 0.3) < 0)
    30percentmethod();
else
    10percentmethod();

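There is no need to change other lines, and one can quite easily see what happens without digging into auxiliary classes. A small downside is that it doesn't enforce that the percentages sum to 100%.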

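Using `if (rnd < 0.6)` would mean the next if has to be `if (rnd < 0.9)`, i.e. you have to keep track of the sum of the percentages of the earlier ifs. With only 3 or 4 options that is not a problem, but imagine you had 30 and then changed the weight of the first one: you would have to change the weight in every subsequent if statement. This way each weight is tied only to its own if statement, except of course for the else at the end. (2 upvotes)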
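
The subtract-as-you-go idea carries over to any number of weights. Here is a minimal sketch, assuming hypothetical names (`SubtractingChooser`, `choose`) that are not in the answer; the `roll` parameter stands in for `Math.random()` so that the logic can be checked deterministically.

```java
// Hypothetical sketch of the subtract-as-you-go idea for any number of weights.
final class SubtractingChooser {
    // weights are expected to sum to 1; roll is a value in [0, 1), e.g. Math.random()
    static int choose(double[] weights, double roll) {
        if (weights.length == 0) {
            throw new IllegalArgumentException("no weights");
        }
        double rnd = roll;
        for (int i = 0; i < weights.length - 1; i++) {
            rnd -= weights[i]; // each weight is used exactly once, as in the if chain
            if (rnd < 0) {
                return i;
            }
        }
        return weights.length - 1; // plays the role of the final else
    }
}
```

As in the if chain, the last entry absorbs whatever probability is left over, so the sketch does not enforce that the weights sum to 1 either.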

Answer 3

Ste*_*low 16

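I'm not sure whether there is a common name for this, but I think I learned it back in university.

It basically works just as you described: it receives a list of values and "frequency numbers", and one value is chosen according to the weighted probabilities.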

list = (1,a),(1,b),(2,c),(6,d)

total = list.sum()   // sum of the weights
rnd = random(0, total)
sum = 0
for i from 0 to list.size():
    sum += list[i].weight
    if sum >= rnd:
        return list[i].value
return list.last().value

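If you want to generalize it, the list can be a function parameter.

This also works with floating-point numbers, and the numbers don't have to be normalized. If you do normalize (e.g. so that they sum to 1), you can skip the list.sum() part.

Edit:

Due to demand, here is an actual compiling Java implementation and usage example: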

import java.util.ArrayList;
import java.util.Random;

public class RandomWheel<T>
{
  private static final class RandomWheelSection<T>
  {
    public double weight;
    public T value;

    public RandomWheelSection(double weight, T value)
    {
      this.weight = weight;
      this.value = value;
    }
  }

  private ArrayList<RandomWheelSection<T>> sections = new ArrayList<>();
  private double totalWeight = 0;
  private Random random = new Random();

  public void addWheelSection(double weight, T value)
  {
    sections.add(new RandomWheelSection<T>(weight, value));
    totalWeight += weight;
  }

  public T draw()
  {
    double rnd = totalWeight * random.nextDouble();

    double sum = 0;
    for (int i = 0; i < sections.size(); i++)
    {
      sum += sections.get(i).weight;
      if (sum >= rnd)
        return sections.get(i).value;
    }
    return sections.get(sections.size() - 1).value;
  }

  public static void main(String[] args)
  {
    RandomWheel<String> wheel = new RandomWheel<String>();
    wheel.addWheelSection(1, "a");
    wheel.addWheelSection(1, "b");
    wheel.addWheelSection(2, "c");
    wheel.addWheelSection(6, "d");

    for (int i = 0; i < 100; i++)
        System.out.print(wheel.draw());
  }
}

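True, but this is more of a general question. I'm sure you know how to implement it in Java... (15 upvotes)
I'm sure Java programmers can read pseudocode and translate it into the required SingletonRunnerFactory calls. (2 upvotes)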

Answer 4

whn*_*whn 8

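While the selected answer works, it is unfortunately asymptotically slow for your use case. Instead of doing this, you could use something called alias sampling. Alias sampling (or the alias method) is a technique for selecting elements with a weighted distribution. If the weights for choosing those elements don't change, you can do each selection in O(1) time! If that isn't the case, you can still get amortized O(1) time as long as the ratio between the number of selections you make and the number of changes you make to the alias table (changing the weights) is high. The currently selected answer suggests an O(N) algorithm; the next best thing is O(log(N)), given sorted probabilities and binary search, but nothing is going to beat the O(1) time I suggested.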
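
For comparison, the O(log(N)) variant mentioned above could look roughly like the following. This is one possible reading, assuming the binary search runs over an array of cumulative sums; the class name `BinarySearchChooser` and its methods are hypothetical. `Arrays.binarySearch` returns the index of an exact match, or `-(insertion point) - 1` when the key is not found.

```java
import java.util.Arrays;

// Hypothetical sketch of the O(log N) variant: binary search over cumulative sums.
final class BinarySearchChooser {
    private final double[] cumulative;

    BinarySearchChooser(double[] weights) {
        if (weights.length == 0) {
            throw new IllegalArgumentException("no weights");
        }
        cumulative = new double[weights.length];
        double sum = 0.0;
        for (int i = 0; i < weights.length; i++) {
            if (weights[i] <= 0) {
                throw new IllegalArgumentException("weights must be positive");
            }
            sum += weights[i];
            cumulative[i] = sum; // strictly increasing, so the array is sorted
        }
    }

    double total() {
        return cumulative[cumulative.length - 1];
    }

    // roll must satisfy 0 <= roll < total(); index i owns [cumulative[i-1], cumulative[i])
    int choose(double roll) {
        int pos = Arrays.binarySearch(cumulative, roll);
        // an exact hit on cumulative[k] belongs to the next range, hence pos + 1
        int index = pos >= 0 ? pos + 1 : -pos - 1;
        return Math.min(index, cumulative.length - 1);
    }
}
```

A caller would draw `roll` uniformly from [0, total()), for example with `ThreadLocalRandom.current().nextDouble(chooser.total())`.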

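This site provides a good overview of the alias method that is mostly language-agnostic. Essentially, you create a table in which each entry represents the outcome of two probabilities. Each entry has a single threshold: below the threshold you get one value, above it you get the other. You spread the larger probabilities across multiple table entries so that the probability graph of all probabilities combined has an area of one.

Say you have the probabilities A, B, C and D, with the values 0.1, 0.1, 0.1 and 0.7 respectively. The alias method spreads the probability of 0.7 across all the others. One index corresponds to each probability, where the indices of A, B and C each hold 0.1 of their own probability plus 0.15 of D's, and D's index holds 0.25. With this you normalize each index, so that in A's index you end up with a 0.4 chance of getting A and a 0.6 chance of getting D (0.1/(0.1 + 0.15) and 0.15/(0.1 + 0.15) respectively), likewise for B's and C's indices, and a 100% chance of getting D in D's index (0.25/0.25 is 1).

Given an unbiased uniform PRNG (Math.random()) for indexing, you get an equal probability of choosing each index, but you also do a coin flip per index, which provides the weighted probability. You have a 25% chance of landing on A's slot or D's slot, but within A's slot you only have a 40% chance of picking A and a 60% chance of picking D. 0.40 * 0.25 = 0.1, our original probability, and if you add up all of D's probability spread across the other indices, you get 0.70 again.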
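
The arithmetic of this worked example can be checked mechanically. The sketch below is a hypothetical helper (the names `AliasTableCheck` and `effectiveProbabilities` are not from the answer) that takes a finished alias table and recovers the overall probability of each outcome: every slot is picked with probability 1/n, and then the coin flip either keeps the slot's own outcome or switches to its alias.

```java
// Hypothetical check of the worked example: recover overall probabilities from an alias table.
final class AliasTableCheck {
    // threshold[i]: chance of keeping outcome i once slot i is picked
    // alias[i]: outcome returned otherwise
    static double[] effectiveProbabilities(double[] threshold, int[] alias) {
        int n = threshold.length;
        double[] result = new double[n];
        for (int i = 0; i < n; i++) {
            result[i] += threshold[i] / n;                // slot i picked, coin flip keeps i
            result[alias[i]] += (1.0 - threshold[i]) / n; // slot i picked, coin flip goes to the alias
        }
        return result;
    }
}
```

For the table from the example, thresholds 0.4, 0.4, 0.4, 1.0 with every alias pointing at D (index 3), this gives 0.1, 0.1, 0.1 and 0.7 up to floating-point rounding, i.e. the original distribution.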

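So to do a random selection, you only need to generate a random index from 0 (inclusive) to N (exclusive) and then do a coin flip; no matter how many items you add, this is very fast and has constant cost. Making an alias table doesn't take that many lines of code either: my Python version took 80 lines including import statements and line breaks, and the version presented in the Pandas article is similarly sized (and it's C++).

For your Java implementation, you could map the probabilities and array list indices to the functions you must execute, creating an array of functions that are executed as you index into it; alternatively, you could use function objects (functors) that have a method through which you pass in the parameters to execute.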

ArrayList<(YourFunctionObject)> function_list;
// add functions
AliasSampler aliassampler = new AliasSampler(listOfProbabilities);
// somewhere later with some type T and some parameter values. 
int index = aliassampler.sampleIndex();
T result = function_list.get(index).apply(parameters);
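
A compilable version of this glue code might look like the sketch below. It is only an illustration: the class name `IndexDispatcher` is hypothetical, the actions are plain `Runnable`s rather than parameterized function objects, and the sampler is passed in as an `IntSupplier` (for example `aliasSampler::sampleIndex`) so the sketch does not depend on any particular sampler class.

```java
import java.util.List;
import java.util.function.IntSupplier;

// Hypothetical glue code: run the action stored at whatever index a sampler returns.
final class IndexDispatcher {
    private final List<Runnable> actions;
    private final IntSupplier indexSampler; // e.g. aliasSampler::sampleIndex

    IndexDispatcher(List<Runnable> actions, IntSupplier indexSampler) {
        this.actions = actions;
        this.indexSampler = indexSampler;
    }

    void runOne() {
        // the sampler is expected to return an index in [0, actions.size())
        actions.get(indexSampler.getAsInt()).run();
    }
}
```

The order of the actions in the list has to match the order of the probabilities handed to the sampler, since the index is the only link between the two.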

Edit:

I've created a Java version of the AliasSampler method as a class; it uses the sample index method and should be usable as shown above.

import java.util.ArrayList;
import java.util.Collections;
import java.util.Random;

public class AliasSampler {
    private ArrayList<Double> binaryProbabilityArray;
    private ArrayList<Integer> aliasIndexList;
    AliasSampler(ArrayList<Double> probabilities){
        // the probabilities must sum to 1 (compared with a tolerance because of floating-point rounding)
        double total = 0.0;
        for(Double probability : probabilities){
            total += probability;
        }
        assert(Math.abs(total - 1.0) < 1e-9);
        int n = probabilities.size();
        // probabilityArray is the list of probabilities, this is the incoming probabilities scaled
        // by the number of probabilities.  This allows us to figure out which probabilities need to be spread 
        // to others since they are too large, ie [0.1 0.1 0.1 0.7] = [0.4 0.4 0.4 2.80]
        ArrayList<Double> probabilityArray = new ArrayList<Double>();
        for(Double probability : probabilities){
            probabilityArray.add(probability * n);
        }
        binaryProbabilityArray = new ArrayList<Double>(Collections.nCopies(n, 0.0));
        aliasIndexList = new ArrayList<Integer>(Collections.nCopies(n, 0));
        ArrayList<Integer> lessThanOneIndexList = new ArrayList<Integer>();
        ArrayList<Integer> greaterThanOneIndexList = new ArrayList<Integer>();
        for(int index = 0; index < probabilityArray.size(); index++){
            double probability = probabilityArray.get(index);
            if(probability < 1.0){
                lessThanOneIndexList.add(index);
            }
            else{
                greaterThanOneIndexList.add(index);
            }
        }

        // while we still have indices to check for in each list, we attempt to spread the probability of those larger
        // what this ends up doing in our first example is taking greater than one elements (2.80) and removing 0.6, 
        // and spreading it to different indices, so (((2.80 - 0.6) - 0.6) - 0.6) will equal 1.0, and the rest will
        // be 0.4 + 0.6 = 1.0 as well. 
        while(lessThanOneIndexList.size() != 0 && greaterThanOneIndexList.size() != 0){
            //https://stackoverflow.com/questions/16987727/removing-last-object-of-arraylist-in-java
            // last element removal is equivalent to pop, java does this in constant time
            int lessThanOneIndex = lessThanOneIndexList.remove(lessThanOneIndexList.size() - 1);
            int greaterThanOneIndex = greaterThanOneIndexList.remove(greaterThanOneIndexList.size() - 1);
            double probabilityLessThanOne = probabilityArray.get(lessThanOneIndex);
            binaryProbabilityArray.set(lessThanOneIndex, probabilityLessThanOne);
            aliasIndexList.set(lessThanOneIndex, greaterThanOneIndex);
            probabilityArray.set(greaterThanOneIndex, probabilityArray.get(greaterThanOneIndex) + probabilityLessThanOne - 1);
            if(probabilityArray.get(greaterThanOneIndex) < 1){
                lessThanOneIndexList.add(greaterThanOneIndex);
            }
            else{
                greaterThanOneIndexList.add(greaterThanOneIndex);
            }
        }
        //if there are any probabilities left in either index list, they can't be spread across the other 
        //indicies, so they are set with probability 1.0. They still have the probabilities they should at this step, it works out mathematically.
        while(greaterThanOneIndexList.size() != 0){
            int greaterThanOneIndex = greaterThanOneIndexList.remove(greaterThanOneIndexList.size() - 1);
            binaryProbabilityArray.set(greaterThanOneIndex, 1.0);
        }
        while(lessThanOneIndexList.size() != 0){
            int lessThanOneIndex = lessThanOneIndexList.remove(lessThanOneIndexList.size() - 1);
            binaryProbabilityArray.set(lessThanOneIndex, 1.0);
        }
    }
    public int sampleIndex(){
        int index = new Random().nextInt(binaryProbabilityArray.size());
        double r = Math.random();
        if( r < binaryProbabilityArray.get(index)){
            return index;
        }
        else{
            return aliasIndexList.get(index);
        }
    }

}

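@Michael While I generally agree with that principle, I think that in this case actually implementing the alias table is simple enough that it isn't that big a deal, and using it is actually much simpler than the marked answer. Additionally, the question is about the "de facto coding pattern" he is talking about, and I would argue that ***the alias method is that coding pattern***. So while the top answer provides a simple solution, it is not the standard for this kind of problem either. I think this is akin to using an array where a linked list or hash table would be used. (3 upvotes)
@Michael To avoid sounding defensive, I should reiterate that I agree in principle. To summarize, the biggest reason I think this doesn't fall under premature optimization is that I believe the alias method is the standard coding pattern the OP seems to be asking for. (2 upvotes)
@Michael My method is guaranteed to be at least as fast as the example you provided, even in its very rare best case: index with Math.random(), compare the value, return the index, execute based on the index. That's it. In any case where the search has to perform multiple iterations, it is always faster. This isn't some theoretical Fibonacci heap that *might* become better given a large enough N because of high constant costs; you can prove that just by understanding how it works. (2 upvotes)
@Micheal, the need to explain something doesn't invalidate it on any level; hash tables are hard to explain, but I doubt you would pass the same judgment on their use. Also, I explained the algorithm here, so you don't even need to look at the article, and the Python code I listed isn't hard to understand; I simply can't account for your confusion on that part. Like I said, 23 versus 35? Not that small. Again, the performance gain isn't trivial, it's a gain of several orders of magnitude, which is obvious because it is O(1) with a roughly constant cost, and the OP is no doubt sampling in a loop as well. (2 upvotes)
@Michael I disagree with the idea that needing to explain something makes it no longer simple, and I also disagree that we don't know this is a performance problem. You decided you were "done" before I had even replied to you. (2 upvotes)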

Answer 5

NPE*_*NPE 6

You could compute the cumulative probability for each class, pick a random number from [0; 1) and see where that number falls.

import java.util.Random;

class WeightedRandomPicker {

    private static Random random = new Random();

    public static int choose(double[] probabilties) {
        double randomVal = random.nextDouble();
        double cumulativeProbability = 0;
        for (int i = 0; i < probabilties.length; ++i) {
            cumulativeProbability += probabilties[i];
            if (randomVal < cumulativeProbability) {
                return i;
            }
        }
        return probabilties.length - 1; // to account for numerical errors
    }

    public static void main (String[] args) {
        double[] probabilties = new double[]{0.1, 0.1, 0.2, 0.6}; // the final value is optional
        for (int i = 0; i < 20; ++i) {
            System.out.printf("%d\n", choose(probabilties));
        }
    }
}

Archived: 8 years, 5 months ago
Views: 4116
Last activity: 8 years, 1 month ago