Pet*_* O. 26 language-agnostic random algorithm
有没有一种有效的方法来生成 N 个整数的随机组合,使得——
min, max] 中,sum,是否有类似的随机组合算法,其中整数必须按其值(而不是任何顺序)按排序顺序出现?
(选择具有平均值的适当组合mean是一种特殊情况,如果sum = N * mean。这个问题相当于生成sumN 个部分的均匀随机分区,每个部分都在区间 [ min, max] 中并以任何顺序出现或按它们的排序顺序出现值,视情况而定。)
我知道对于以随机顺序出现的组合,可以通过以下方式解决此问题(编辑 [Apr. 27]:算法修改。):
如果N * max < sum或N * min > sum,则无解。
如果N * max == sum,则只有一个解,其中所有N数字都等于max。如果N * min == sum,则只有一个解,其中所有N数字都等于min。
使用Smith 和 Tromble(“Sampling from the Unit Simplex”,2004)中给出的算法生成 N 个随机非负整数,其总和为sum - N * min。
添加min到以这种方式生成的每个数字。
如果任何数字大于max,则转到步骤 3。
但是,如果max远小于,则该算法很慢sum。例如,根据我的测试(上面涉及的特殊情况的实现mean),算法平均拒绝——
N = 7, min = 3, max = 10, sum = 42,但是N = 20, min = 3, max = 10, sum = 120.有没有办法修改这个算法,使其对大 N 有效,同时仍然满足上述要求?
编辑:
作为评论中建议的替代方法,产生有效随机组合(满足除最后一个要求之外的所有要求)的有效方法是:
X,尽可能为有效组合的数量sum,min和max。Y, 中的一个统一随机整数[0, X)。Y转换为有效的组合。但是,是否有计算有效组合(或排列)数量的公式,有没有办法将整数转换为有效组合?[编辑(4 月 28 日):排列而不是组合相同]。
编辑(4 月 27 日):
阅读 Devroye 的Non-Uniform Random Variate Generation (1986) 后,我可以确认这是生成随机分区的问题。此外,第 661 页的练习 2(尤其是 E 部分)与此问题相关。
编辑(4 月 28 日):
事实证明,我给出的算法是统一的,其中所涉及的整数以随机顺序给出,而不是按它们的值排序。由于这两个问题都具有普遍意义,因此我修改了这个问题以寻求两个问题的规范答案。
以下 Ruby 代码可用于验证一致性的潜在解决方案(其中algorithm(...)是候选算法):
combos={}
permus={}
mn=0
mx=6
sum=12
for x in mn..mx
for y in mn..mx
for z in mn..mx
if x+y+z==sum
permus[[x,y,z]]=0
end
if x+y+z==sum and x<=y and y<=z
combos[[x,y,z]]=0
end
end
end
end
3000.times {|x|
f=algorithm(3,sum,mn,mx)
combos[f.sort]+=1
permus[f]+=1
}
p combos
p permus
Run Code Online (Sandbox Code Playgroud)
编辑(4 月 29 日):重新添加了当前实现的 Ruby 代码。
以下代码示例是在 Ruby 中给出的,但我的问题与编程语言无关:
def posintwithsum(n, total)
raise if n <= 0 or total <=0
ls = [0]
ret = []
while ls.length < n
c = 1+rand(total-1)
found = false
for j in 1...ls.length
if ls[j] == c
found = true
break
end
end
if found == false;ls.push(c);end
end
ls.sort!
ls.push(total)
for i in 1...ls.length
ret.push(ls[i] - ls[i - 1])
end
return ret
end
def integersWithSum(n, total)
raise if n <= 0 or total <=0
ret = posintwithsum(n, total + n)
for i in 0...ret.length
ret[i] = ret[i] - 1
end
return ret
end
# Generate 100 valid samples
mn=3
mx=10
sum=42
n=7
100.times {
while true
pp=integersWithSum(n,sum-n*mn).map{|x| x+mn }
if !pp.find{|x| x>mx }
p pp; break # Output the sample and break
end
end
}
Run Code Online (Sandbox Code Playgroud)
这是我在 Java 中的解决方案。它功能齐全,包含两个生成器:PermutationPartitionGenerator用于未排序分区和CombinationPartitionGenerator用于排序分区。您的生成器也在类中实现以SmithTromblePartitionGenerator进行比较。该类SequentialEnumerator按顺序枚举所有可能的分区(未排序或排序,取决于参数)。我为所有这些生成器添加了全面的测试(包括您的测试用例)。在大多数情况下,实现是不言自明的。如果你有任何问题,我会在几天内回答。
import java.util.Random;
import java.util.function.Supplier;
public abstract class PartitionGenerator implements Supplier<int[]>{
public static final Random rand = new Random();
protected final int numberCount;
protected final int min;
protected final int range;
protected final int sum; // shifted sum
protected final boolean sorted;
protected PartitionGenerator(int numberCount, int min, int max, int sum, boolean sorted) {
if (numberCount <= 0)
throw new IllegalArgumentException("Number count should be positive");
this.numberCount = numberCount;
this.min = min;
range = max - min;
if (range < 0)
throw new IllegalArgumentException("min > max");
sum -= numberCount * min;
if (sum < 0)
throw new IllegalArgumentException("Sum is too small");
if (numberCount * range < sum)
throw new IllegalArgumentException("Sum is too large");
this.sum = sum;
this.sorted = sorted;
}
// Whether this generator returns sorted arrays (i.e. combinations)
public final boolean isSorted() {
return sorted;
}
public interface GeneratorFactory {
PartitionGenerator create(int numberCount, int min, int max, int sum);
}
}
import java.math.BigInteger;
// Permutations with repetition (i.e. unsorted vectors) with given sum
public class PermutationPartitionGenerator extends PartitionGenerator {
private final double[][] distributionTable;
public PermutationPartitionGenerator(int numberCount, int min, int max, int sum) {
super(numberCount, min, max, sum, false);
distributionTable = calculateSolutionCountTable();
}
private double[][] calculateSolutionCountTable() {
double[][] table = new double[numberCount + 1][sum + 1];
BigInteger[] a = new BigInteger[sum + 1];
BigInteger[] b = new BigInteger[sum + 1];
for (int i = 1; i <= sum; i++)
a[i] = BigInteger.ZERO;
a[0] = BigInteger.ONE;
table[0][0] = 1.0;
for (int n = 1; n <= numberCount; n++) {
double[] t = table[n];
for (int s = 0; s <= sum; s++) {
BigInteger z = BigInteger.ZERO;
for (int i = Math.max(0, s - range); i <= s; i++)
z = z.add(a[i]);
b[s] = z;
t[s] = z.doubleValue();
}
// swap a and b
BigInteger[] c = b;
b = a;
a = c;
}
return table;
}
@Override
public int[] get() {
int[] p = new int[numberCount];
int s = sum; // current sum
for (int i = numberCount - 1; i >= 0; i--) {
double t = rand.nextDouble() * distributionTable[i + 1][s];
double[] tableRow = distributionTable[i];
int oldSum = s;
// lowerBound is introduced only for safety, it shouldn't be crossed
int lowerBound = s - range;
if (lowerBound < 0)
lowerBound = 0;
s++;
do
t -= tableRow[--s];
// s can be equal to lowerBound here with t > 0 only due to imprecise subtraction
while (t > 0 && s > lowerBound);
p[i] = min + (oldSum - s);
}
assert s == 0;
return p;
}
public static final GeneratorFactory factory = (numberCount, min, max,sum) ->
new PermutationPartitionGenerator(numberCount, min, max, sum);
}
import java.math.BigInteger;
// Combinations with repetition (i.e. sorted vectors) with given sum
public class CombinationPartitionGenerator extends PartitionGenerator {
private final double[][][] distributionTable;
public CombinationPartitionGenerator(int numberCount, int min, int max, int sum) {
super(numberCount, min, max, sum, true);
distributionTable = calculateSolutionCountTable();
}
private double[][][] calculateSolutionCountTable() {
double[][][] table = new double[numberCount + 1][range + 1][sum + 1];
BigInteger[][] a = new BigInteger[range + 1][sum + 1];
BigInteger[][] b = new BigInteger[range + 1][sum + 1];
double[][] t = table[0];
for (int m = 0; m <= range; m++) {
a[m][0] = BigInteger.ONE;
t[m][0] = 1.0;
for (int s = 1; s <= sum; s++) {
a[m][s] = BigInteger.ZERO;
t[m][s] = 0.0;
}
}
for (int n = 1; n <= numberCount; n++) {
t = table[n];
for (int m = 0; m <= range; m++)
for (int s = 0; s <= sum; s++) {
BigInteger z;
if (m == 0)
z = a[0][s];
else {
z = b[m - 1][s];
if (m <= s)
z = z.add(a[m][s - m]);
}
b[m][s] = z;
t[m][s] = z.doubleValue();
}
// swap a and b
BigInteger[][] c = b;
b = a;
a = c;
}
return table;
}
@Override
public int[] get() {
int[] p = new int[numberCount];
int m = range; // current max
int s = sum; // current sum
for (int i = numberCount - 1; i >= 0; i--) {
double t = rand.nextDouble() * distributionTable[i + 1][m][s];
double[][] tableCut = distributionTable[i];
if (s < m)
m = s;
s -= m;
while (true) {
t -= tableCut[m][s];
// m can be 0 here with t > 0 only due to imprecise subtraction
if (t <= 0 || m == 0)
break;
m--;
s++;
}
p[i] = min + m;
}
assert s == 0;
return p;
}
public static final GeneratorFactory factory = (numberCount, min, max, sum) ->
new CombinationPartitionGenerator(numberCount, min, max, sum);
}
import java.util.*;
public class SmithTromblePartitionGenerator extends PartitionGenerator {
public SmithTromblePartitionGenerator(int numberCount, int min, int max, int sum) {
super(numberCount, min, max, sum, false);
}
@Override
public int[] get() {
List<Integer> ls = new ArrayList<>(numberCount + 1);
int[] ret = new int[numberCount];
int increasedSum = sum + numberCount;
while (true) {
ls.add(0);
while (ls.size() < numberCount) {
int c = 1 + rand.nextInt(increasedSum - 1);
if (!ls.contains(c))
ls.add(c);
}
Collections.sort(ls);
ls.add(increasedSum);
boolean good = true;
for (int i = 0; i < numberCount; i++) {
int x = ls.get(i + 1) - ls.get(i) - 1;
if (x > range) {
good = false;
break;
}
ret[i] = x;
}
if (good) {
for (int i = 0; i < numberCount; i++)
ret[i] += min;
return ret;
}
ls.clear();
}
}
public static final GeneratorFactory factory = (numberCount, min, max, sum) ->
new SmithTromblePartitionGenerator(numberCount, min, max, sum);
}
import java.util.Arrays;
// Enumerates all partitions with given parameters
public class SequentialEnumerator extends PartitionGenerator {
private final int max;
private final int[] p;
private boolean finished;
public SequentialEnumerator(int numberCount, int min, int max, int sum, boolean sorted) {
super(numberCount, min, max, sum, sorted);
this.max = max;
p = new int[numberCount];
startOver();
}
private void startOver() {
finished = false;
int unshiftedSum = sum + numberCount * min;
fillMinimal(0, Math.max(min, unshiftedSum - (numberCount - 1) * max), unshiftedSum);
}
private void fillMinimal(int beginIndex, int minValue, int fillSum) {
int fillRange = max - minValue;
if (fillRange == 0)
Arrays.fill(p, beginIndex, numberCount, max);
else {
int fillCount = numberCount - beginIndex;
fillSum -= fillCount * minValue;
int maxCount = fillSum / fillRange;
int maxStartIndex = numberCount - maxCount;
Arrays.fill(p, maxStartIndex, numberCount, max);
fillSum -= maxCount * fillRange;
Arrays.fill(p, beginIndex, maxStartIndex, minValue);
if (fillSum != 0)
p[maxStartIndex - 1] = minValue + fillSum;
}
}
@Override
public int[] get() { // returns null when there is no more partition, then starts over
if (finished) {
startOver();
return null;
}
int[] pCopy = p.clone();
if (numberCount > 1) {
int i = numberCount;
int s = p[--i];
while (i > 0) {
int x = p[--i];
if (x == max) {
s += x;
continue;
}
x++;
s--;
int minRest = sorted ? x : min;
if (s < minRest * (numberCount - i - 1)) {
s += x;
continue;
}
p[i++]++;
fillMinimal(i, minRest, s);
return pCopy;
}
}
finished = true;
return pCopy;
}
public static final GeneratorFactory permutationFactory = (numberCount, min, max, sum) ->
new SequentialEnumerator(numberCount, min, max, sum, false);
public static final GeneratorFactory combinationFactory = (numberCount, min, max, sum) ->
new SequentialEnumerator(numberCount, min, max, sum, true);
}
import java.util.*;
import java.util.function.BiConsumer;
import PartitionGenerator.GeneratorFactory;
public class Test {
private final int numberCount;
private final int min;
private final int max;
private final int sum;
private final int repeatCount;
private final BiConsumer<PartitionGenerator, Test> procedure;
public Test(int numberCount, int min, int max, int sum, int repeatCount,
BiConsumer<PartitionGenerator, Test> procedure) {
this.numberCount = numberCount;
this.min = min;
this.max = max;
this.sum = sum;
this.repeatCount = repeatCount;
this.procedure = procedure;
}
@Override
public String toString() {
return String.format("=== %d numbers from [%d, %d] with sum %d, %d iterations ===",
numberCount, min, max, sum, repeatCount);
}
private static class GeneratedVector {
final int[] v;
GeneratedVector(int[] vect) {
v = vect;
}
@Override
public int hashCode() {
return Arrays.hashCode(v);
}
@Override
public boolean equals(Object obj) {
if (this == obj)
return true;
return Arrays.equals(v, ((GeneratedVector)obj).v);
}
@Override
public String toString() {
return Arrays.toString(v);
}
}
private static final Comparator<Map.Entry<GeneratedVector, Integer>> lexicographical = (e1, e2) -> {
int[] v1 = e1.getKey().v;
int[] v2 = e2.getKey().v;
int len = v1.length;
int d = len - v2.length;
if (d != 0)
return d;
for (int i = 0; i < len; i++) {
d = v1[i] - v2[i];
if (d != 0)
return d;
}
return 0;
};
private static final Comparator<Map.Entry<GeneratedVector, Integer>> byCount =
Comparator.<Map.Entry<GeneratedVector, Integer>>comparingInt(Map.Entry::getValue)
.thenComparing(lexicographical);
public static int SHOW_MISSING_LIMIT = 10;
private static void checkMissingPartitions(Map<GeneratedVector, Integer> map, PartitionGenerator reference) {
int missingCount = 0;
while (true) {
int[] v = reference.get();
if (v == null)
break;
GeneratedVector gv = new GeneratedVector(v);
if (!map.containsKey(gv)) {
if (missingCount == 0)
System.out.println(" Missing:");
if (++missingCount > SHOW_MISSING_LIMIT) {
System.out.println(" . . .");
break;
}
System.out.println(gv);
}
}
}
public static final BiConsumer<PartitionGenerator, Test> distributionTest(boolean sortByCount) {
return (PartitionGenerator gen, Test test) -> {
System.out.print("\n" + getName(gen) + "\n\n");
Map<GeneratedVector, Integer> combos = new HashMap<>();
// There's no point of checking permus for sorted generators
// because they are the same as combos for them
Map<GeneratedVector, Integer> permus = gen.isSorted() ? null : new HashMap<>();
for (int i = 0; i < test.repeatCount; i++) {
int[] v = gen.get();
if (v == null && gen instanceof SequentialEnumerator)
break;
if (permus != null) {
permus.merge(new GeneratedVector(v), 1, Integer::sum);
v = v.clone();
Arrays.sort(v);
}
combos.merge(new GeneratedVector(v), 1, Integer::sum);
}
Set<Map.Entry<GeneratedVector, Integer>> sortedEntries = new TreeSet<>(
sortByCount ? byCount : lexicographical);
System.out.println("Combos" + (gen.isSorted() ? ":" : " (don't have to be uniform):"));
sortedEntries.addAll(combos.entrySet());
for (Map.Entry<GeneratedVector, Integer> e : sortedEntries)
System.out.println(e);
checkMissingPartitions(combos, test.getGenerator(SequentialEnumerator.combinationFactory));
if (permus != null) {
System.out.println("\nPermus:");
sortedEntries.clear();
sortedEntries.addAll(permus.entrySet());
for (Map.Entry<GeneratedVector, Integer> e : sortedEntries)
System.out.println(e);
checkMissingPartitions(permus, test.getGenerator(SequentialEnumerator.permutationFactory));
}
};
}
public static final BiConsumer<PartitionGenerator, Test> correctnessTest =
(PartitionGenerator gen, Test test) -> {
String genName = getName(gen);
for (int i = 0; i < test.repeatCount; i++) {
int[] v = gen.get();
if (v == null && gen instanceof SequentialEnumerator)
v = gen.get();
if (v.length != test.numberCount)
throw new RuntimeException(genName + ": array of wrong length");
int s = 0;
if (gen.isSorted()) {
if (v[0] < test.min || v[v.length - 1] > test.max)
throw new RuntimeException(genName + ": generated number is out of range");
int prev = test.min;
for (int x : v) {
if (x < prev)
throw new RuntimeException(genName + ": unsorted array");
s += x;
prev = x;
}
} else
for (int x : v) {
if (x < test.min || x > test.max)
throw new RuntimeException(genName + ": generated number is out of range");
s += x;
}
if (s != test.sum)
throw new RuntimeException(genName + ": wrong sum");
}
System.out.format("%30s : correctness test passed%n", genName);
};
public static final BiConsumer<PartitionGenerator, Test> performanceTest =
(PartitionGenerator gen, Test test) -> {
long time = System.nanoTime();
for (int i = 0; i < test.repeatCount; i++)
gen.get();
time = System.nanoTime() - time;
System.out.format("%30s : %8.3f s %10.0f ns/test%n", getName(gen), time * 1e-9, time * 1.0 / test.repeatCount);
};
public PartitionGenerator getGenerator(GeneratorFactory factory) {
return factory.create(numberCount, min, max, sum);
}
public static String getName(PartitionGenerator gen) {
String name = gen.getClass().getSimpleName();
if (gen instanceof SequentialEnumerator)
return (gen.isSorted() ? "Sorted " : "Unsorted ") + name;
else
return name;
}
public static GeneratorFactory[] factories = { SmithTromblePartitionGenerator.factory,
PermutationPartitionGenerator.factory, CombinationPartitionGenerator.factory,
SequentialEnumerator.permutationFactory, SequentialEnumerator.combinationFactory };
public static void main(String[] args) {
Test[] tests = {
new Test(3, 0, 3, 5, 3_000, distributionTest(false)),
new Test(3, 0, 6, 12, 3_000, distributionTest(true)),
new Test(50, -10, 20, 70, 2_000, correctnessTest),
new Test(7, 3, 10, 42, 1_000_000, performanceTest),
new Test(20, 3, 10, 120, 100_000, performanceTest)
};
for (Test t : tests) {
System.out.println(t);
for (GeneratorFactory factory : factories) {
PartitionGenerator candidate = t.getGenerator(factory);
t.procedure.accept(candidate, t);
}
System.out.println();
}
}
}
Run Code Online (Sandbox Code Playgroud)
你可以在 Ideone 上试试这个。
这是 John McClane 的 PermutationPartitionGenerator 的算法,在本页的另一个答案中。它有两个阶段,即设置阶段和采样阶段,并n在 [ min, max] 中生成随机数与总和sum,其中数字以随机顺序列出。
设置阶段:首先,使用以下公式构建解表(t(y, x)其中y位于 [0, n] 中且x位于 [0, sum - n * min] 中):
此处,t(y, x) 存储y数字之和(在适当范围内)等于的相对概率x。这个概率是相对于所有具有相同 的 t(y, x) 的y。
采样阶段:在这里我们生成一个n数字样本。设置s为sum - n * min,然后对于每个位置i,从n - 10开始并向后工作:
v为 [0, t(i+1, s)) 中的随机整数。r为min。v。v仍然为 0 或更大时,从 中减去 t(i, s-1) v,在 中加 1 r,从 中减去 1 s。i样本中位置的编号设置为r。编辑:
似乎对上面的算法进行微不足道的更改,可以让每个随机数使用单独的范围,而不是对所有随机数使用相同的范围:
位置i∈ [0, n)处的每个随机数具有最小值 min(i) 和最大值 max(i)。
让adjsum= sum- Σmin(i)。
设置阶段:首先,使用以下公式构建解表(t(y, x)其中y位于 [0, n] 中且x位于 [0, adjsum] 中):
采样阶段与之前完全相同,除了我们设置s为adjsum(而不是sum - n * min) 并设置r为 min(i)(而不是min)。
编辑:
对于 John McClane 的 CombinationPartitionGenerator,设置和采样阶段如下。
设置阶段:首先,使用以下公式构建解表(t(z, y, x)其中z在[0, n]中,y在[0, max - min]中,x在[0, sum - n * min]中):
采样阶段:在这里我们生成一个n数字样本。设置s要sum - n * min和mrange到max - min,然后为每个位置i,开始n - 1和向后工作,以0:
v为 [0, t(i+1, mrange, s)) 中的随机整数。mrange为 min( mrange, s)mrange从s。r为min + mrange。i, mrange, s) v。v保持为0或更大,加1 s,减1从r来自和1 mrange,然后减去吨(i,mrange,s从)v。i样本中位置的编号设置为r。