直接按升序计算数字因子而不进行排序?

Tod*_*man 14 algorithm primes enumeration factors

是否有一种有效的算法来按升序排列数n的因子而不进行排序?"有效"我的意思是:

  1. 该算法通过从n的素数幂分解开始,避免了对除数的强力搜索.

  2. 算法的运行时复杂度为O(d log 2 d)或更好,其中dn的除数.

  3. 算法的空间复杂度为O(d).

  4. 该算法避免了排序操作.也就是说,这些因素是按顺序生成的,而不是按顺序生成并随后排序.尽管使用简单的递归方法进行枚举然后排序是O(d log 2 d),但是对于排序中涉及的所有存储器访问来说,存在非常难看的成本.

一个简单的例子是n = 360 =2³×3²×5,其中d = 24个因子:{1,2,3,4,5,6,8,9,10,12,15,18,20,24, 30,36,40,45,60,72,90,120,180,360}.

一个更严重的例子是n = 278282512406132373381723386382308832000 =2⁸×3⁴×5³×7²×11²×13²×17×19×23×29×31×37×41×43×47×53×59×61×67×71×73 ×79,其中d = 318504960因子(显然这里列出太多了!).顺便提一下,这个数字具有最大数量的因子,最多可达2 ^ 128.

我可以发誓几周前我看到了这种算法的描述,带有示例代码,但现在我似乎无法在任何地方找到它.它使用了一些魔术技巧,在每个素数因子的输出列表中维护一个祖先索引列表.(更新:我使用汉明数字混淆因子生成,其运算方式类似.)

更新

我最终在运行时使用O(d)的解决方案,具有极低的内存开销,可以就地创建O(d)输出,并且比我所知的任何其他方法都要快得多.我已经发布了这个解决方案作为答案,使用C源代码.它是另一个贡献者Will Ness在Haskell中提供的一个优化算法的优化简化版本.我选择了Will的答案作为公认的答案,因为它提供了一个非常优雅的解决方案,符合最初规定的所有要求.

Wil*_*ess 6

我们可以合并多个流,生成所以首先没有重复.

从列表开始[1],对于每个唯一的素因子p,我们将列表乘以p迭代k次数(其中k是多重性p)以获得k新列表,并将它们与该传入列表合并在一起.

在哈斯克尔,

ordfactors n =          --   <----------------------<---<---<-----\
  foldr g [1]           -- g (19,1) (g (7,1) (g (3,2) (g (2,3) [1])))
    . reverse           -- [(19,1),(7,1),(3,2),(2,3)]              ^
    . primePowers       -- [(2,3),(3,2),(7,1),(19,1)]              |
    $ n                 -- 9576                    --->--->--->----/
       where
         g (p,k) xs = mergeAsTree 
                        . take (k+1)          -- take 3 [a,b,c,d..] = [a,b,c]
                        . iterate (map (p*))  -- iterate f x = [x, f x, f(f x),...]
                        $ xs

{-  g (2,3) [1] = let a0 = [1]        
                      a1 = map (2*) a0               -- [2]
                      a2 = map (2*) a1               -- [4]
                      a3 = map (2*) a2               -- [8]
                  in mergeAsTree [ a0, a1, a2, a3 ]  -- xs2 = [1,2,4,8]

    g (3,2) xs2 = let b0 = xs2                       -- [1,2,4,8]
                      b1 = map (3*) b0               -- [3,6,12,24]
                      b2 = map (3*) b1               -- [9,18,36,72]
                  in mergeAsTree [ b0, b1, b2 ]      -- xs3 = [1,2,3,4,6,8,9,12,...] 

    g (7,1) xs3 = mergeAsTree [ xs3, map (7*) xs3 ]  -- xs4 = [1,2,3,4,6,7,8,9,...]

    g (19,1) xs4 = mergeAsTree [ xs4, map (19*) xs4 ]  
-}
mergeAsTree xs   = foldi merge [] xs    -- [a,b]     --> merge a b
                                        -- [a,b,c,d] --> merge (merge a b) (merge c d)
foldi f z []     = z
foldi f z (x:xs) = f x (foldi f z (pairs f xs))
pairs f (x:y:t)  = f x y : pairs f t
pairs _ t        = t
Run Code Online (Sandbox Code Playgroud)

foldi为了速度,以树状的方式安排二进制merges(假设在合并的流中没有重复,用于简化操作); 而是以线性方式工作.foldr

primePowers n | n > 1 =           -- 9576 => [(2,3),(3,2),(7,1),(19,1)]
  map (\xs -> (head xs,length xs)) --                                ^
    . group                       -- [[2,2,2],[3,3],[7],[19]]        |
    . factorize                   -- [2,2,2,3,3,7,19]                |
    $ n                           -- 9576             --->--->--->---/
Run Code Online (Sandbox Code Playgroud)

group是一个标准函数,它将列表中的相邻等值组合在一起,factorize是一种众所周知的试验分割算法,它以非递减顺序生成数字的素数因子:

factorize n | n > 1 = go n (2:[3,5..])   -- 9576 = 2*2*2*3*3*7*19
   where                                 -- produce prime factors only
     go n (d:t)
        | d*d > n    = [n]
        | r == 0     =  d : go q (d:t)
        | otherwise  =      go n t
            where 
              (q,r)  = quotRem n d
factorize _ = []
Run Code Online (Sandbox Code Playgroud)

(.)是一个函数式合成运算符,将其参数函数的输出在其右侧作为输入链接到左侧的输出.它(和$)可以大声朗读为"of".


j_r*_*ker 5

简而言之:反复从堆中拉出下一个最小的因子,然后推回该因子的每个仍是 n 因子的倍数。使用技巧来避免出现重复项,以便堆大小永远不会超过 d。时间复杂度为 O(kd log d),其中 k 是不同质因数的数量。

我们利用的关键属性是,如果 x 和 y 都是 n 的因数,其中 y = x*p 对于某个因数 p >= 2 - 即,如果 x 的素因数是y——则 x < y。这意味着在将任何此类 y 插入堆之前输出 x 始终是安全的。堆仅有效地用于比较两个都不是另一个子集的因子。

第一次尝试:重复会减慢速度

首先描述一个产生正确答案但也会产生许多重复答案的算法会很有帮助:

  1. 设置上一个 = NULL。
  2. 将 1 插入堆 H 中。
  3. 从 H 中提取堆 t 的顶部。如果堆为空,则停止。
  4. 如果 t == prev 则转到 3。 [编辑:已修复]
  5. 输出t。
  6. 设置上一个= t。
  7. 对于 n 的每个不同质因数 p:
    • 如果 n % (t*p) == 0(即如果 t*p 仍然是 n 的因子),则将 t*p 推到 H 上。
  8. 转到3。

上述算法的唯一问题是它可以多次生成相同的因子。例如,如果 n = 30,则因子 15 将生成为因子 5 的子级(通过乘以素数因子 3),并且也将生成为因子 3 的子级(通过乘以 5)。解决这个问题的一种方法是注意,当任何重复项到达堆顶部时,必须在连续的块中读出它们,因此您可以简单地检查堆顶部是否等于刚刚提取的值,并保持如果是的话,提取并丢弃它。但更好的方法是可能的:

从源头删除重复项

因子x有多少种生成方式?首先考虑 x 不包含重数 > 1 的质因数的情况。在这种情况下,如果它包含 m 个不同的质因数,则有 m-1 个“父”因数将在前面的过程中将其生成为“子”算法——每个父母都由 m-1 个质因数的某个子集组成,剩余的质因数是添加到孩子的质因数。(如果 x 具有重数 > 1 的素因数,则实际上有 m 个父母。)如果我们有一种方法可以确定这些父母中的一个恰好是实际生成 x 作为孩子的“选择的一个”,并且该规则导致了一个测试,可以在弹出父级时应用于每个父级,然后我们可以避免首先创建任何重复项。

我们可以使用以下规则:对于任何给定的 x,选择缺少x 的 m 个因子中最大的一个的潜在父代 y 。这构成了一个简单的规则:当且仅当 x = y*p(对于某个 p 大于或等于 y 中已有的任何质因数)时,父代 y 才会生成子代 x。这很容易测试:只需按降序循环遍历质因数,为每个质因数生成子级,直到我们找到一个已经整除 y 的质因数。在前面的示例中,父级 3 将生成 15,但父级 5 不会(因为 3 < 5)——因此 15 确实只生成一次。对于 n = 30,完整的树如下所示:

       1
      /|\
     2 3 5
    /|  \
   6 10  15
   |
   30
Run Code Online (Sandbox Code Playgroud)

请注意,每个因子只生成一次。

新的无重复算法如下:

  1. 将 1 插入堆 H 中。
  2. 从 H 中提取堆 t 的顶部。如果堆为空,则停止。
  3. 输出t。
  4. 对于 n 的每个不同质因数 p按降序排列
    • 如果 n % (t*p) == 0(即如果 t*p 仍然是 n 的因子),则将 t*p 推到 H 上。
    • 如果 t % p == 0 (即如果 t 已经包含 p 作为因子)则停止。
  5. 转到2。


Tod*_*man 5

这个答案给出了一个C实现,我认为这是最快,最节省内存的。

算法概述。该算法基于Will Ness在另一个答案中引入的自下而上的合并方法,但经过进一步简化,因此要合并的列表实际上根本不在内存中的任何位置。整理每个列表的head元素并将其保留在一个小的数组中,同时根据需要即时构建列表的所有其他元素。这种“幻像列表”的使用(即正在运行的代码的想象力的缩影)极大地减少了内存占用量以及读写的内存访问量,还改善了空间局部性,从而显着提高了速度算法的 依次将每个级别的因子直接写入它们在输出数组中的最终静止位置。

基本思想是通过对素数功率因数分解的数学归纳法来产生因子。例如,要产生360的因数,需要计算72的因数,然后乘以5的相关乘方,在这种情况下为{1,5}。为了产生因子72,计算因子8,然后乘以3的相关乘方,在这种情况下为{1,3,9};为了产生因子8,将基本情况1乘以2的相关幂,在这种情况下为{1,2,4,8}。因此,在每个归纳步骤中,都会在一组现有因子和一组素数之间计算笛卡尔乘积,并将结果通过乘法还原为整数。

下图是数字360的图示。一行代表一个时间步。时间表示为垂直流动。

360除数

空间复杂度。产生因子的临时数据结构非常小:仅使用O(log?(n))空间,其中n是要生成因子的数字。例如,在C中此算法的128位实现中,诸如333,939,014,887,358,848,058,068,068,063,658,770,598,400(其基数为2的对数为127.97)的数字分配5.1 GB来存储其318,504,960因数的列表,但仅使用360 字节的额外开销产生那个清单。任何128位数字最多大约需要5 KB的开销。

运行时复杂度。运行时间完全取决于素数幂分解的指数(例如素数签名)。为了获得最佳结果,应首先处理最大指数,最后处理最小指数,以便在最后阶段通过低复杂度合并来控制运行时,在实践中,合并通常被认为是二进制合并。渐近运行时间为O(ÇÑdÑ)),其中dÑ)是的除数计数Ñ并且其中ÇÑ)是一些恒定依赖的主要签名Ñ。也就是说,与素数签名关联的每个等价类具有不同的常数。由小指数支配的素数签名具有较小的常数。由大指数支配的素数签名具有较大的常数。因此,运行时复杂性通过主要签名聚类。

图。在3.4 GHz上分析了运行时性能。英特尔®酷睿i7对66,591个n值进行采样,其中dn)个因子为dn),最大不超过1.6亿。此图的n最大值为14,550,525,518,294,259,162,294,162,737,840,640,000,在2.35秒内产生159,744,000个因子。

总运行时间

每秒产生的分类因子的数量本质上是上述条件的倒置。通过主签名进行聚类在数据中显而易见。L1,L2和L3高速缓存大小以及物理RAM限制也影响性能。

每秒产生的因素

源代码。下面附带的是C编程语言(特别是C11)的概念验证程序。尽管它在GCC上也可以正常运行,但已在带有Clang / LLVM的x86-64上进行了测试。

/*==============================================================================

DESCRIPTION

   This is a small proof-of-concept program to test the idea of generating the
   factors of a number in ascending order using an ultra-efficient sortless
   method.


INPUT

   Input is given on the command line, either as a single argument giving the
   number to be factored or an even number of arguments giving the 2-tuples that
   comprise the prime-power factorization of the desired number.  For example,
   the number

      75600 = 2^4 x 3^3 x 5^2 x 7

   can be given by the following list of arguments:

      2 4 3 3 5 2 7 1

   Note:  If a single number is given, it will require factoring to produce its
   prime-power factorization.  Since this is just a small test program, a very
   crude factoring method is used that is extremely fast for small prime factors
   but extremely slow for large prime factors.  This is actually fine, because
   the largest factor lists occur with small prime factors anyway, and it is the
   production of large factor lists at which this program aims to be proficient.
   It is simply not interesting to be fast at producing the factor list of a
   number like 17293823921105882610 = 2 x 3 x 5 x 576460797370196087, because
   it has only 32 factors.  Numbers with tens or hundreds of thousands of
   factors are much more interesting.


OUTPUT

   Results are written to standard output.  A list of factors in ascending order
   is produced, followed by runtime required to generate the list (not including
   time to print it).


AUTHOR

   Todd Lehman
   2015/05/10

*/

//-----------------------------------------------------------------------------
#include <inttypes.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>
#include <stdio.h>
#include <stdarg.h>
#include <string.h>
#include <ctype.h>
#include <time.h>
#include <math.h>
#include <assert.h>

//-----------------------------------------------------------------------------
typedef  unsigned int  uint;
typedef  uint8_t       uint8;
typedef  uint16_t      uint16;
typedef  uint32_t      uint32;
typedef  uint64_t      uint64;
typedef  __uint128_t   uint128;

#define  UINT128_MAX  (uint128)(-1)

#define  UINT128_MAX_STRLEN  39

//-----------------------------------------------------------------------------
#define  ARRAY_CAPACITY(x)  (sizeof(x) / sizeof((x)[0]))

//-----------------------------------------------------------------------------
// This structure encode a single prime-power pair for the prime-power
// factorization of numbers, for example 3 to the 4th power.

#pragma pack(push, 8)
typedef struct
{
  uint128  p;  // Prime.
  uint8    e;  // Power (exponent).
}
PrimePower;   // 24 bytes using 8-byte packing
#pragma pack(pop)

//-----------------------------------------------------------------------------
// Prime-power factorization structure.
//
// This structure is sufficient to represent the prime-power factorization of
// all 128-bit values.  The field names ? and ? are dervied from the standard
// number theory functions ?(n) and ?(n), which count the number of unique and
// non-unique prime factors of n, respectively.  The field name d is derived
// from the standard number theory function d(n), which counts the number of
// divisors of n, including 1 and n.
//
// The maximum possible value here of ? is 26, which occurs at
// n = 232862364358497360900063316880507363070 = 2 x 3 x 5 x 7 x 11 x 13 x 17 x
// 19 x 23 x 29 x 31 x 37 x 41 x 43 x 47 x 53 x 59 x 61 x 67 x 71 x 73 x 79 x
// 83 x 89 x 97 x 101, which has 26 unique prime factors.
//
// The maximum possible value of ? here is 127, which occurs at n = 2^127 and
// n = 2^126 x 3, both of which have 127 non-unique prime factors.
//
// The maximum possible value of d here is 318504960, which occurs at
// n = 333939014887358848058068063658770598400 = 2^9 x 3^5 x 5^2 x 7^2 x 11^2 x
// 13^2 x 17 x 19 x 23 x 29 x 31 x 37 x 41 x 43 x 47 x 53 x 59 x 61 x 67 x 71 x
// 73 x 79.
//
#pragma pack(push, 8)
typedef struct
{
  PrimePower  f[32];  // Primes and their exponents.
  uint8       ?;      // Count of prime factors without multiplicity.
  uint8       ?;      // Count of prime factors with multiplicity.
  uint32      d;      // Count of factors of n, including 1 and n.
  uint128     n;      // Value of n on which all other fields depend.
}
PrimePowerFactorization;  // 656 bytes using 8-byte packing
#pragma pack(pop)

#define  MAX_?  26
#define  MAX_?  127

//-----------------------------------------------------------------------------
// Fatal error:  print error message and abort.

void fatal_error(const char *format, ...)
{
  va_list args;
  va_start(args, format);
  vfprintf(stderr, format, args);
  exit(1);
}

//------------------------------------------------------------------------------
uint128 uint128_from_string(const char *const str)
{
  assert(str != NULL);

  uint128 n = 0;
  for (int i = 0; isdigit(str[i]); i++)
    n = (n * 10) + (uint)(str[i] - '0');

  return n;
}

//------------------------------------------------------------------------------
void uint128_to_string(const uint128 n,
                       char *const strbuf, const uint strbuflen)
{
  assert(strbuf != NULL);
  assert(strbuflen >= UINT128_MAX_STRLEN + 1);

  // Extract digits into string buffer in reverse order.
  uint128 a = n;
  char *s = strbuf;
  do { *(s++) = '0' + (uint)(a % 10); a /= 10; } while (a != 0);
  *s = '\0';

  // Reverse the order of the digits.
  uint l = strlen(strbuf);
  for (uint i = 0; i < l/2; i++)
    { char t = strbuf[i]; strbuf[i] = strbuf[l-1-i]; strbuf[l-1-i] = t; }

  // Verify result.
  assert(uint128_from_string(strbuf) == n);
}

//------------------------------------------------------------------------------
char *uint128_to_static_string(const uint128 n, const uint i)
{
  static char str[2][UINT128_MAX_STRLEN + 1];
  assert(i < ARRAY_CAPACITY(str));
  uint128_to_string(n, str[i], ARRAY_CAPACITY(str[i]));
  return str[i];
}

//------------------------------------------------------------------------------
// Compute sorted list of factors, given a prime-power factorization.

uint128 *compute_factors(const PrimePowerFactorization ppf)
{
  const uint128 n =       ppf.n;
  const uint    d = (uint)ppf.d;
  const uint    ? = (uint)ppf.?;

  if (n == 0)
    return NULL;

  uint128 *factors = malloc((d + 1) * sizeof(*factors));
  if (!factors)
    fatal_error("Failed to allocate array of %u factors.", d);
  uint128 *const factors_end = &factors[d];


  // --- Seed the factors[] array.

  factors_end[0] = 0;   // Dummy value to simplify looping in bottleneck code.
  factors_end[-1] = 1;  // Seed value.

  if (n == 1)
    return factors;


  // --- Iterate over all prime factors.

  uint range = 1;
  for (uint i = 0; i < ?; i++)
  {
    const uint128 p = ppf.f[i].p;
    const uint    e = ppf.f[i].e;

    // --- Initialize phantom input lists and output list.
    assert(e < 128);
    assert(range < d);
    uint128 *restrict in[128];
    uint128 pe[128], f[128];
    for (uint j = 0; j <= e; j++)
    {
      in[j] = &factors[d - range];
      pe[j] = (j == 0)? 1 : pe[j-1] * p;
      f[j] = pe[j];
    }
    uint active_list_count = 1 + e;
    range *= 1 + e;
    uint128 *restrict out = &factors[d - range];

    // --- Merge phantom input lists to output list, until all input lists are
    //     extinguished.
    while (active_list_count > 0)
    {
      if (active_list_count == 1)
      {
        assert(out == in[0]);
        while (out != factors_end)
          *(out++) *= pe[0];
        in[0] = out;
      }
      else if (active_list_count == 2)
      {
        // This section of the code is the bottleneck of the entire factor-
        // producing algorithm.  Other portions need to be fast, but this
        // *really* needs to be fast; therefore, it has been highly optimized.
        // In fact, it is by far most frequently the case here that pe[0] is 1,
        // so further optimization is warranted in this case.
        uint128 f0 = f[0], f1 = f[1];
        uint128 *in0 = in[0], *in1 = in[1];
        const uint128 pe0 = pe[0], pe1 = pe[1];

        if (pe[0] == 1)
        {
          while (true)
          {
            if (f0 < f1)
              { *(out++) = f0; f0 = *(++in0);
                if (in0 == factors_end) break; }
            else
              { *(out++) = f1; f1 = *(++in1) * pe1; }
          }
        }
        else
        {
          while (true)
          {
            if (f0 < f1)
              { *(out++) = f0; f0 = *(++in0) * pe0;
                if (in0 == factors_end) break; }
            else
              { *(out++) = f1; f1 = *(++in1) * pe1; }
          }
        }

        f[0] = f0; f[1] = f1;
        in[0] = in0; in[1] = in1;
      }
      else if (active_list_count == 3)
      {
        uint128 f0 = f[0], f1 = f[1], f2 = f[2];
        uint128 *in0 = in[0], *in1 = in[1], *in2 = in[2];
        const uint128 pe0 = pe[0], pe1 = pe[1], pe2 = pe[2];

        while (true)
        {
          if (f0 < f1)
          {
            if (f0 < f2)
              { *(out++) = f0; f0 = *(++in0) * pe0;
                if (in0 == factors_end) break; }
            else
              { *(out++) = f2; f2 = *(++in2) * pe2; }
          }
          else
          {
            if (f1 < f2)
              { *(out++) = f1; f1 = *(++in1) * pe1; }
            else
              { *(out++) = f2; f2 = *(++in2) * pe2; }
          }
        }

        f[0] = f0; f[1] = f1, f[2] = f2;
        in[0] = in0; in[1] = in1, in[2] = in2;
      }
      else if (active_list_count >= 3)
      {
        while (true)
        {
          // Chose the smallest multiplier.
          uint k_min = 0;
          uint128 f_min = f[0];
          for (uint k = 0; k < active_list_count; k++)
            if (f[k] < f_min)
              { f_min = f[k]; k_min = k; }

          // Write the output factor, advance the input pointer, and
          // produce a new factor in the array f[] of list heads.
          *(out++) = f_min;
          f[k_min] = *(++in[k_min]) * pe[k_min];
          if (in[k_min] == factors_end)
            { assert(k_min == 0); break; }
        }
      }

      // --- Remove the newly emptied phantom input list.  Note that this is
      //     guaranteed *always* to be the first remaining non-empty list.
      assert(in[0] == factors_end);
      for (uint j = 1; j < active_list_count; j++)
      {
        in[j-1] = in[j];
        pe[j-1] = pe[j];
         f[j-1] =  f[j];
      }
      active_list_count -= 1;
    }

    assert(out == factors_end);
  }


  // --- Validate array of sorted factors.
  #ifndef NDEBUG
  {
    for (uint k = 0; k < d; k++)
    {
      if (factors[k] == 0)
        fatal_error("Produced a factor of 0 at index %u.", k);

      if (n % factors[k] != 0)
        fatal_error("Produced non-factor %s at index %u.",
                    uint128_to_static_string(factors[k], 0), k);

      if ((k > 0) && (factors[k-1] == factors[k]))
        fatal_error("Duplicate factor %s at index %u.",
                    uint128_to_static_string(factors[k], 0), k);

      if ((k > 0) && (factors[k-1] > factors[k]))
        fatal_error("Out-of-order factors %s and %s at indexes %u and %u.",
                    uint128_to_static_string(factors[k-1], 0),
                    uint128_to_static_string(factors[k], 1),
                    k-1, k);
    }
  }
  #endif


  return factors;
}

//------------------------------------------------------------------------------
// Print prime-power factorization of a number.

void print_ppf(const PrimePowerFactorization ppf)
{
  printf("%s = ", uint128_to_static_string(ppf.n, 0));
  if (ppf.n == 0)
  {
    printf("0");
  }
  else
  {
    for (uint i = 0; i < ppf.?; i++)
    {
      if (i > 0)
        printf(" x ");

      printf("%s", uint128_to_static_string(ppf.f[i].p, 0));

      if (ppf.f[i].e > 1)
        printf("^%"PRIu8"", ppf.f[i].e);
    }
  }
  printf("\n");
}

//------------------------------------------------------------------------------
int compare_powers_ascending(const void *const pf1,
                             const void *const pf2)
{
  const PrimePower f1 = *((const PrimePower *)pf1);
  const PrimePower f2 = *((const PrimePower *)pf2);

  return  (f1.e < f2.e)?  -1:
          (f1.e > f2.e)?  +1:
                           0;  // Not an error; duplicate exponents are common.
}

//------------------------------------------------------------------------------
int compare_powers_descending(const void *const pf1,
                              const void *const pf2)
{
  const PrimePower f1 = *((const PrimePower *)pf1);
  const PrimePower f2 = *((const PrimePower *)pf2);

  return  (f1.e < f2.e)?  +1:
          (f1.e > f2.e)?  -1:
                           0;  // Not an error; duplicate exponents are common.
}

//------------------------------------------------------------------------------
int compare_primes_ascending(const void *const pf1,
                             const void *const pf2)
{
  const PrimePower f1 = *((const PrimePower *)pf1);
  const PrimePower f2 = *((const PrimePower *)pf2);

  return  (f1.p < f2.p)?  -1:
          (f1.p > f2.p)?  +1:
                           0;  // Error; duplicate primes must never occur.
}

//------------------------------------------------------------------------------
int compare_primes_descending(const void *const pf1,
                              const void *const pf2)
{
  const PrimePower f1 = *((const PrimePower *)pf1);
  const PrimePower f2 = *((const PrimePower *)pf2);

  return  (f1.p < f2.p)?  +1:
          (f1.p > f2.p)?  -1:
                           0;  // Error; duplicate primes must never occur.
}

//------------------------------------------------------------------------------
// Sort prime-power factorization.

void sort_ppf(PrimePowerFactorization *const ppf,
              const bool primes_major,      // Best false
              const bool primes_ascending,  // Best false
              const bool powers_ascending)  // Best false
{
  int (*compare_primes)(const void *, const void *) =
    primes_ascending? compare_primes_ascending : compare_primes_descending;

  int (*compare_powers)(const void *, const void *) =
    powers_ascending? compare_powers_ascending : compare_powers_descending;

  if (primes_major)
  {
    mergesort(ppf->f, ppf->?, sizeof(ppf->f[0]), compare_powers);
    mergesort(ppf->f, ppf->?, sizeof(ppf->f[0]), compare_primes);
  }
  else
  {
    mergesort(ppf->f, ppf->?, sizeof(ppf->f[0]), compare_primes);
    mergesort(ppf->f, ppf->?, sizeof(ppf->f[0]), compare_powers);
  }
}

//------------------------------------------------------------------------------
// Compute prime-power factorization of a 128-bit value.  Note that this
// function is designed to be fast *only* for numbers with very simple
// factorizations, e.g., those that produce large factor lists.  Do not attempt
// to factor large semiprimes with this function.  (The author does know how to
// factor large numbers efficiently; however, efficient factorization is beyond
// the scope of this small test program.)

PrimePowerFactorization compute_ppf(const uint128 n)
{
  PrimePowerFactorization ppf;

  if (n == 0)
  {
    ppf = (PrimePowerFactorization){ .? = 0, .? = 0, .d = 0, .n = 0 };
  }
  else if (n == 1)
  {
    ppf = (PrimePowerFactorization){ .f[0] = { .p = 1, .e = 1 },
                                     .? = 1, .? = 1, .d = 1, .n = 1 };
  }
  else
  {
    ppf = (PrimePowerFactorization){ .? = 0, .? = 0, .d = 1, .n = n };

    uint128 m = n;
    for (uint128 p = 2; p * p <= m; p += 1 + (p > 2))
    {
      if (m % p == 0)
      {
        assert(ppf.? <= MAX_?);
        ppf.f[ppf.?].p = p;
        ppf.f[ppf.?].e = 0;
        while (m % p == 0)
          { m /= p; ppf.f[ppf.?].e += 1; }
        ppf.d *= (1 + ppf.f[ppf.?].e);
        ppf.? += ppf.f[ppf.?].e;
        ppf.? += 1;
      }
    }
    if (m > 1)
    {
      assert(ppf.? <= MAX_?);
      ppf.f[ppf.?].p = m;
      ppf.f[ppf.?].e = 1;
      ppf.d *= 2;
      ppf.? += 1;
      ppf.? += 1;
    }
  }

  return ppf;
}

//------------------------------------------------------------------------------
// Parse prime-power factorization from a list of ASCII-encoded base-10 strings.
// The values are assumed to be 2-tuples (p,e) of prime p and exponent e.
// Primes must not exceed 2^128 - 1 and must not be repeated.  Exponents must
// not exceed 2^8 - 1, but can of course be repeated.  The constructed value
// must not exceed 2^128 - 1.

PrimePowerFactorization parse_ppf(const uint pairs, const char *const values[])
{
  assert(pairs <= MAX_?);

  PrimePowerFactorization ppf;

  if (pairs == 0)
  {
    ppf = (PrimePowerFactorization){ .? = 0, .? = 0, .d = 0, .n = 0 };
  }
  else
  {
    ppf = (PrimePowerFactorization){ .? = 0, .? = 0, .d = 1, .n = 1 };

    for (uint i = 0; i < pairs; i++)
    {
      ppf.f[i].p = uint128_from_string(values[(i*2)+0]);
      ppf.f[i].e = (uint8)strtoumax(values[(i*2)+1], NULL, 10);

      // Validate prime value.
      if (ppf.f[i].p < 2)  // (Ideally this would actually do a primality test.)
        fatal_error("Factor %s is invalid.",
                    uint128_to_static_string(ppf.f[i].p, 0));

      // Accumulate count of unique prime factors.
      if (ppf.? > UINT8_MAX - 1)
        fatal_error("Small-omega overflow at factor %s^%"PRIu8".",
                    uint128_to_static_string(ppf.f[i].p, 0), ppf.f[i].e);
      ppf.? += 1;

      // Accumulate count of total prime factors.
      if (ppf.? > UINT8_MAX - ppf.f[i].e)
        fatal_error("Big-omega wverflow at factor %s^%"PRIu8".",
                    uint128_to_static_string(ppf.f[i].p, 0), ppf.f[i].e);
      ppf.? += ppf.f[i].e;

      // Accumulate total divisor count.
      if (ppf.d > UINT32_MAX / (1 + ppf.f[i].e))
        fatal_error("Divisor c