通过6面模具实现提高7面模具辊模拟的性能

Question

通过6面模具实现提高7面模具辊模拟的性能

jxh*_*jxh 16 c random algorithm performance

在不同的StackExchange中,以下算法(基于算术编码)被提出作为生成7面骰子结果的有效方法,当给出的所有内容都是6面骰子时:

int rand7()
{
  static double a=0, width=7;  // persistent state

  while ((int)(a+width) != (int)a)
  {
    width /= 6;
    a += (rand6()-1)*width;
  }

  int n = (int)a;
  a -= n; 
  a *= 7; width *= 7;
  return (n+1);
}

Run Code Online (Sandbox Code Playgroud)

不是一个真正的数学家,我会尽力解释这个算法是如何工作的:

在每次调用开始时rand7(),width比率为7 ^s/6 ^t,并且a是非负值,其属性a + width位于基本情况之后的区间[0,7]中.进入while循环时,width是可以添加的最大值a.如果floor(a + width)不同floor(a),则添加{0,width*1/6,width*1/3,width*1/2,width*2/3,width*5/6} 的随机选择a,并将指数t递增1(减少width6的幂的值.请注意,在迭代之后,a + width位于区间[0,7]中的属性保持不变.当width变得小于差值时ceil(a) - a,迭代停止.循环增加了更多的熵a,只要这样做实际上可以影响掷骰的结果,并且直观地说,这是使用基数6在[0,7]的范围内构建随机实数.离开环路后,模具辊被取出floor(a) + 1,并a减少到其分数部分.此时a + width位于区间[0,1).为下一个呼叫准备和保持不变的特性,既a和width由7:3的因子按比例放大(对width,这增加了指数s1).

以上解释了归纳步骤的工作原理.对基本案例的分析留作感兴趣的读者的练习.

当然,从效率的角度来看,浮点运算的使用会立即弹出作为性能阻力(假设性能rand6()已经足够并且本身无法改进).在保持这种算术编码算法的同时,删除浮点使用的最佳方法是什么？

Answer 1

ric*_*ici 11

跟进我做的评论,这是算法的定点版本.它使用无符号的4.60(也就是说,在数字的小数部分有60位),这比你可以得到的更多一些double:

int rand7fixed() {
    static uint64_t a = 0;
    static uint64_t width = 7UL<<60;
    static const uint64_t intmask = 0xfUL<<60;

    while (((a+width)&intmask) != (a&intmask)) {
      width /= 6;
      a += (rand6()-1)*width;
    }

    int n = a >> 60;
    a &=~intmask;
    a *= 7;
    width *= 7;
    return n+1;
}

Run Code Online (Sandbox Code Playgroud)

以上结果比OP中的双版本快约三分之一(参见下面的基准测试结果和注释).我怀疑时间浪费不是浮点运算,而是转换double为int.

正如R ..指出的那样,这并没有解决偏见; 它只是减少它.一个简单且合理快速的无偏算法将重复生成两个rand6()值h,l直到其中至少有一个为非零,然后返回h*6+l % 7:

int rand7_2() {
  int hi, lo;
  do {
    hi = rand6() - 1;
    lo = rand6() - 1;
  } while (hi + lo);
  return (hi * 6 + lo) % 7 + 1;
}

Run Code Online (Sandbox Code Playgroud)

如果您觉得需要减少调用次数rand6,则可以使用6 ¹²仅略大于7 ¹¹的事实,一次生成11个7个模具卷.为了消除偏差,仍然需要丢弃一些12个6辊组; 丢弃设置的频率将是或大约为1比11,因此平均每7转需要大约1.19个6卷.你可以通过使用25个6卷来生成23个7卷(每7卷有1.13个6卷)做得更好但是这不太适合64位算术,所以调用的边际优势会被削弱通过128位计算.(6¹²−7¹¹)/6¹²)rand6

这是11/12解决方案:

int rand7_12() {
    static int avail = 0;
    static uint32_t state = 0;
    static const uint32_t discard = 7*7*7*7*7*7*7*7*7*7*7; // 7 ** 11
    static const int out_per_round = 11;
    static const int in_per_round = 12;

    if (!avail) {
      do {
        state = rand6() - 1;
        for (int needed = in_per_round - 1; needed; --needed)
          state = state * 6 + rand6() - 1;
      }
      while (state >= discard);
      avail = out_per_round;
    }
    int rv = state % 7;
    state /= 7;
    --avail;
    return rv + 1;
}

Run Code Online (Sandbox Code Playgroud)

从理论上讲,你应该能够将比率降低到大约1.086.例如,你可以通过从972个6卷生成895个7卷,在1600个集合中丢弃大约一个,平均1.087个6卷/ 7卷,你可以做到这一点,但你需要2513位算术坚持国家.log₇6

我用一个非常精确的基准测试了所有四个函数,调用rand7 700,000,000次,然后打印结果的直方图.结果:

                                                  User time with
Algorithm       User time      rand6() calls     cycling rand6()
----------      ---------      -------------     ---------------
double          32.6 secs          760223193           13.2 secs
fixed           29.4 secs          760223194            7.9 secs
2 for 1         40.2 secs         1440004276
12 for 11       23.7 secs          840670008

Run Code Online (Sandbox Code Playgroud)

上面的基础rand6()实现是Gnu标准c ++库uniform_int_distribution<>(1,6)使用mt19937_64(64位Mersenne Twister)作为PRNG.为了更好地处理标准库中花费的时间,我还使用简单的循环计数器作为伪随机数生成器运行测试; 剩下的13.2秒和7.9秒代表(大致)在算法本身花费的时间,从中我们可以说定点算法的速度提高了大约40%.(很难在组合算法上获得良好的读数,因为固定序列使得分支预测更容易,并且减少了rand6调用的数量,但两者都花了不到5秒.)

最后,直方图,以防任何人想要检查偏差(还包括一个std::uniform_int_distribution(1,7)参考的运行):

Algorithm          1          2          3          4          5          6          7
---------  ---------  ---------  ---------  ---------  ---------  ---------  ---------
reference  100007522  100002456  100015800  100005923   99972185  100008908   99987206
double     100014597  100005975   99982219   99986299  100004561  100011049   99995300
fixed      100009603  100009639  100034790   99989935   99995502   99981886   99978645
2 for 1    100004476   99997766   99999521  100001382   99992802  100003868  100000185
12 for 11   99988156  100004974  100020070  100001912   99997472   99995015   99992401

Run Code Online (Sandbox Code Playgroud)

归档时间：	11 年前
查看次数：	457 次
最近记录：	8 年，1 月前