有谁能比这更快排序吗?

-8 sorting algorithm

我能够为整数编写更快的排序!
\n它的排序速度比生成数组的速度快。它的工作原理是声明一个数组的长度等于要排序的整数数组的最大值并初始化为零。然后,使用其值作为计数数组的索引来循环要排序的数组 - 每次遇到该值时计数数组都会递增。随后,循环计数数组并将其计数次数的索引按顺序分配给输入/输出数组。
\n代码如下:

\n
SUBROUTINE icountSORT(arrA, nA)\n  ! This is a count sort.  It counts the frequency of\n  ! each element in the integer array to be sorted using\n  ! an array with a length of MAXVAL(arrA)+1 such that\n  ! 0\'s are counted at index 1, 1\'s are counted at index 2,\n  ! etc.\n  !\n  ! ~ Derrel Walters\n  IMPLICIT NONE\n\n  INTEGER(KIND=8),INTENT(IN) :: nA\n  INTEGER(KIND=8),DIMENSION(nA),INTENT(INOUT) :: arrA\n\n  INTEGER(KIND=8),ALLOCATABLE,DIMENSION(:) :: arrB\n  INTEGER(KIND=8) :: i, j, k, maxA\n  INTEGER ::  iStat\n\n  maxA = MAXVAL(arrA)\n  ALLOCATE(arrB(maxA+1),STAT=iStat)\n\n  arrB = 0\n\n  DO i = 1, nA\n    arrB(arrA(i)+1) = arrB(arrA(i)+1) + 1\n  END DO\n\n  k = 1\n  DO i = 1, SIZE(arrB)\n    DO j = 1, arrB(i)\n      arrA(k) = i - 1\n      k = k + 1\n    END DO\n  END DO\n\nEND SUBROUTINE icountSORT\n
Run Code Online (Sandbox Code Playgroud)\n

发布更多证据。 nlogn 预测在大数组大小时执行时间太长。 此外,在本问题末尾附近发布的 Fortran 程序将数组(未排序和已排序)写入文件并发布写入和排序时间。 文件写入是一个已知的 O(n) 过程。排序的运行速度比文件一直写入最大数组的速度要快。如果排序以 O(nlogn) 运行,则在某个时刻,排序时间将跨越写入时间,并且在较大数组大小时变得更长。 因此,已证明该排序例程的执行时间复杂度为 O(n)。

\n

我在本文底部添加了一个完整的 Fortran 程序进行编译,以便可以重现输出。执行时间是线性的。

\n

在 Win 10 的 Debian 环境中使用以下代码以更清晰的格式获得更多计时数据:

\n
dwalters@Lapper3:~/PROGRAMMING/DATA-WATER$ for (( i=100000; i<=50000000; i=2*i )); do ./derrelSORT-example.py $i; done | awk  \'BEGIN {print "N      Time(s)"}; {if ($1=="Creating") {printf $4" "} else if ($1=="Sorting" && $NF=="seconds") {print $3}}\'\nN      Time(s)\n100000 0.01\n200000 0.02\n400000 0.04\n800000 0.08\n1600000 0.17\n3200000 0.35\n6400000 0.76\n12800000 1.59\n25600000 3.02\n
Run Code Online (Sandbox Code Playgroud)\n

此代码相对于元素数量线性执行(此处给出整数示例)。它通过随着(合并)排序的进行以指数方式增加已排序块的大小来实现这一点。为了促进指数增长的块:

\n
    \n
  1. 在排序开始之前需要计算迭代次数
  2. \n
  3. 需要为块导出索引转换(具体语言取决于索引协议)以传递到 merge()
  4. \n
  5. 当块大小不能被 2 的幂整除时,优雅地处理列表尾部的余数
  6. \n
\n

考虑到这些事情并开始,传统上,通过合并单值数组对,合并的块可以从 2 增长到 4 到 8 到 16 到 --- 到 2^n。这个单一案例是一个例外,它打破了比较排序的 O(nlogn) 时间复杂度的速度限制。该例程根据要排序的元素数量进行线性排序。

\n

谁能排序得更快吗?;)

\n

Fortran 代码(derrelSort.f90):

\n
! Derrel Walters \xc2\xa9 2019\n! These sort routines were written by Derrel Walters ~ 2019-01-23\n\n\nSUBROUTINE iSORT(arrA, nA)\n  ! This implementation of derrelSORT is for integers,\n  ! but the same principles apply for other datatypes.\n  !\n  ! ~ Derrel Walters\n  IMPLICIT NONE\n\n  INTEGER(KIND=8),INTENT(IN) :: nA\n  INTEGER,DIMENSION(nA),INTENT(INOUT) :: arrA\n\n  INTEGER,DIMENSION(nA) :: arrB\n  INTEGER(KIND=8) :: lowIDX, highIDX, midIDX\n  INTEGER ::  iStat\n  INTEGER(KIND=8) :: i, j, A, B, C, thisHigh, mergeSize, nLoops\n  INTEGER,DIMENSION(:),ALLOCATABLE :: iterMark\n  LOGICAL,DIMENSION(:),ALLOCATABLE :: moreToGo\n\n  arrB = arrA\n  mergeSize = 2\n  lowIDX = 1 - mergeSize\n  highIDX = 0\n\n  nLoops = INT(LOG(REAL(nA))/LOG(2.0))\n  ALLOCATE(iterMark(nLoops), moreToGo(nLoops), STAT=iStat)\n  moreToGo = .FALSE.\n  iterMark = 0\n\n  DO i = 1, nLoops\n    iterMark(i) = FLOOR(REAL(nA)/2**i)\n    IF (MOD(nA, 2**i) > 0) THEN\n      moreToGo(i) = .TRUE.\n      iterMark(i) = iterMark(i) + 1\n    END IF\n  END DO\n\n  DO i = 1, nLoops\n      DO j = 1, iterMark(i)\n        A = 0\n        B = 1\n        C = 0\n        lowIDX = lowIDX + mergeSize\n        highIDX = highIDX + mergeSize\n        midIDX = (lowIDX + highIDX + 1) / 2\n        thisHigh = highIDX\n        IF (j == iterMark(i).AND.moreToGo(i)) THEN\n          lowIDX = lowIDX - mergeSize\n          highIDX = highIDX - mergeSize\n          midIDX = (lowIDX + highIDX + 1) / 2\n          A = midIDX - lowIDX\n          B = 2\n          C = nA - 2*highIDX + midIDX - 1\n          thisHigh = nA\n        END IF\n        CALL imerge(arrA(lowIDX:midIDX-1+A), B*(midIDX-lowIDX),    &\n                    arrA(midIDX+A:thisHigh), highIDX-midIDX+1+C,   &\n                    arrB(lowIDX:thisHigh), thisHigh-lowIDX+1)\n        arrA(lowIDX:thisHigh) = arrB(lowIDX:thisHigh)\n      END DO\n      mergeSize = 2*mergeSize\n      lowIDX = 1 - mergeSize\n      highIDX = 0\n  END DO\n\nEND SUBROUTINE iSORT\n\nSUBROUTINE imerge(arrA, nA, arrB, nB, arrC, nC)\n  ! This merge is a faster merge.  Array A arrives\n  ! just to the left of Array B, and Array C is\n  ! filled from both ends simultaneously - while\n  ! still preserving the stability of the sort.\n  ! The derrelSORT routine is so fast, that\n  ! the merge does not affect the O(n) time\n  ! complexity of the sort in practice\n  !\n  ! ~ Derrel Walters\n  IMPLICIT NONE\n\n  INTEGER(KIND=8),INTENT(IN) :: nA, nB , nC\n\n  INTEGER,DIMENSION(nA),INTENT(IN) :: arrA\n  INTEGER,DIMENSION(nB),INTENT(IN) :: arrB\n  INTEGER,DIMENSION(nC),INTENT(INOUT) :: arrC\n\n  INTEGER(KIND=8) :: i, j, k, x, y, z\n\n  arrC = 0\n  i = 1\n  j = 1\n  k = 1\n  x = nA\n  y = nB\n  z = nC\n\n  DO\n    IF (i > x .OR. j > y) EXIT\n    IF (arrB(j) < arrA(i)) THEN\n      arrC(k) = arrB(j)\n      j = j + 1\n    ELSE\n      arrC(k) = arrA(i)\n      i = i + 1\n    END IF\n    IF (arrA(x) > arrB(y)) THEN\n      arrC(z) = arrA(x)\n      x = x - 1\n    ELSE\n      arrC(z) = arrB(y)\n      y = y - 1\n    END IF\n    k = k + 1\n    z = z - 1\n  END DO\n\n  IF (i <= x) THEN\n    DO\n      IF (i > x) EXIT\n        arrC(k) = arrA(i)\n        i = i + 1\n        k = k + 1\n    END DO\n  ELSEIF (j <= y) THEN\n    DO\n      IF (j > y) EXIT\n        arrC(k) = arrB(j)\n        j = j + 1\n        k = k + 1\n    END DO\n  END IF\nEND SUBROUTINE imerge\n
Run Code Online (Sandbox Code Playgroud)\n

次使用 f2py3 将上述 fortran 文件 (derrelSORT.f90) 转换为可在 python 中调用的文件。以下是 python 代码及其生成的时间 (derrelSORT-example.py):

\n
#!/bin/python3\n\nimport numpy as np\nimport derrelSORT as dS\nimport time as t\nimport random as rdm\nimport sys\n\ntry:\n  array_len = int(sys.argv[1])\nexcept IndexError:\n  array_len = 100000000\n\n# Create an array with array_len elements\nprint(50*\'-\')\nprint("Creating array of", array_len, "random integers.")\nt0 = t.time()\nx = np.asfortranarray(np.array([round(100000*rdm.random(),0)\n                      for i in range(array_len)]).astype(np.int32))\nt1 = t.time()\nprint(\'Creation time:\', round(t1-t0, 2), \'seconds\')\n\n\n# Sort the array using derrelSORT\nprint("Sorting the array with derrelSORT.")\nt0 = t.time()\ndS.isort(x, len(x))\nt1 = t.time()\nprint(\'Sorting time:\', round(t1-t0, 2), \'seconds\')\nprint(50*\'-\')\n
Run Code Online (Sandbox Code Playgroud)\n

命令行的输出。请注意时间。

\n
dwalters@Lapper3:~/PROGRAMMING/DATA-WATER$ ./derrelSORT-example.py 1000000\n--------------------------------------------------\nCreating array of 1000000 random integers.\nCreation time: 0.78 seconds\nSorting the array with derrelSORT.\nSorting time: 0.1 seconds\n--------------------------------------------------\ndwalters@Lapper3:~/PROGRAMMING/DATA-WATER$ ./derrelSORT-example.py 10000000\n--------------------------------------------------\nCreating array of 10000000 random integers.\nCreation time: 8.1 seconds\nSorting the array with derrelSORT.\nSorting time: 1.07 seconds\n--------------------------------------------------\ndwalters@Lapper3:~/PROGRAMMING/DATA-WATER$ ./derrelSORT-example.py 20000000\n--------------------------------------------------\nCreating array of 20000000 random integers.\nCreation time: 15.73 seconds\nSorting the array with derrelSORT.\nSorting time: 2.21 seconds\n--------------------------------------------------\ndwalters@Lapper3:~/PROGRAMMING/DATA-WATER$ ./derrelSORT-example.py 40000000\n--------------------------------------------------\nCreating array of 40000000 random integers.\nCreation time: 31.64 seconds\nSorting the array with derrelSORT.\nSorting time: 4.39 seconds\n--------------------------------------------------\ndwalters@Lapper3:~/PROGRAMMING/DATA-WATER$ ./derrelSORT-example.py 80000000\n--------------------------------------------------\nCreating array of 80000000 random integers.\nCreation time: 64.03 seconds\nSorting the array with derrelSORT.\nSorting time: 8.92 seconds\n--------------------------------------------------\ndwalters@Lapper3:~/PROGRAMMING/DATA-WATER$ ./derrelSORT-example.py 160000000\n--------------------------------------------------\nCreating array of 160000000 random integers.\nCreation time: 129.56 seconds\nSorting the array with derrelSORT.\nSorting time: 18.04 seconds\n--------------------------------------------------\n
Run Code Online (Sandbox Code Playgroud)\n

更多输出:

\n
dwalters@Lapper3:~/PROGRAMMING/DATA-WATER$ for (( i=100000; i<=500000000; i=2*i )); do\n> ./derrelSORT-example.py $i\n> done\n--------------------------------------------------\nCreating array of 100000 random integers.\nCreation time: 0.08 seconds\nSorting the array with derrelSORT.\nSorting time: 0.01 seconds\n--------------------------------------------------\n--------------------------------------------------\nCreating array of 200000 random integers.\nCreation time: 0.16 seconds\nSorting the array with derrelSORT.\nSorting time: 0.02 seconds\n--------------------------------------------------\n--------------------------------------------------\nCreating array of 400000 random integers.\nCreation time: 0.32 seconds\nSorting the array with derrelSORT.\nSorting time: 0.04 seconds\n--------------------------------------------------\n--------------------------------------------------\nCreating array of 800000 random integers.\nCreation time: 0.68 seconds\nSorting the array with derrelSORT.\nSorting time: 0.08 seconds\n--------------------------------------------------\n--------------------------------------------------\nCreating array of 1600000 random integers.\nCreation time: 1.25 seconds\nSorting the array with derrelSORT.\nSorting time: 0.15 seconds\n--------------------------------------------------\n--------------------------------------------------\nCreating array of 3200000 random integers.\nCreation time: 2.57 seconds\nSorting the array with derrelSORT.\nSorting time: 0.32 seconds\n--------------------------------------------------\n--------------------------------------------------\nCreating array of 6400000 random integers.\nCreation time: 5.23 seconds\nSorting the array with derrelSORT.\nSorting time: 0.66 seconds\n--------------------------------------------------\n--------------------------------------------------\nCreating array of 12800000 random integers.\nCreation time: 10.09 seconds\nSorting the array with derrelSORT.\nSorting time: 1.35 seconds\n--------------------------------------------------\n--------------------------------------------------\nCreating array of 25600000 random integers.\nCreation time: 20.25 seconds\nSorting the array with derrelSORT.\nSorting time: 2.74 seconds\n--------------------------------------------------\n--------------------------------------------------\nCreating array of 51200000 random integers.\nCreation time: 41.84 seconds\nSorting the array with derrelSORT.\nSorting time: 5.62 seconds\n--------------------------------------------------\n--------------------------------------------------\nCreating array of 102400000 random integers.\nCreation time: 93.19 seconds\nSorting the array with derrelSORT.\nSorting time: 11.49 seconds\n--------------------------------------------------\n--------------------------------------------------\nCreating array of 204800000 random integers.\nCreation time: 167.55 seconds\nSorting the array with derrelSORT.\nSorting time: 24.13 seconds\n--------------------------------------------------\n--------------------------------------------------\nCreating array of 409600000 random integers.\nCreation time: 340.84 seconds\nSorting the array with derrelSORT.\nSorting time: 47.21 seconds\n--------------------------------------------------\n
Run Code Online (Sandbox Code Playgroud)\n

当数组大小加倍时,时间也会加倍 - 正如所示。因此,米歇尔先生的初步评估是错误的。原因是,虽然外部循环确定每个块大小(即 log2(n))的循环数,但内部循环计数器随着排序的进行呈指数减少。然而,众所周知的证据就是布丁。时间清楚地表明了线性关系。

\n

如果有人需要任何帮助来重现结果,请告诉我。我很乐意提供帮助。

\n

本文末尾的 Fortran 程序是我在 2019 年编写的 Fortran 程序的原样副本。它旨在在命令行上使用。编译它:

\n
    \n
  1. 将 fortran 代码复制到扩展名为 .f90 的文件中
  2. \n
  3. 使用命令编译代码,例如:
  4. \n
\n
gfortran -o derrelSORT-ex.x derrelSORT.f90\n
Run Code Online (Sandbox Code Playgroud)\n
    \n
  1. 授予自己运行可执行文件的权限:
  2. \n
\n
chmod u+x derrelSORT-ex.x\n
Run Code Online (Sandbox Code Playgroud)\n
    \n
  1. 使用或不使用整数参数从命令行执行程序:
  2. \n
\n
./derrelSORT-ex.x\n
Run Code Online (Sandbox Code Playgroud)\n

或者

\n
./derrelSORT-ex.x 10000000\n
Run Code Online (Sandbox Code Playgroud)\n

输出应该如下所示(这里,我使用了 bash C 风格的循环来重复调用该命令)。 请注意,随着每次迭代数组大小加倍,执行时间也会加倍。

\n
SORT-RESEARCH$ for (( i=100000; i<500000000; i=2*i )); do\n> ./derrelSORT-2022.x $i\n> done\n\nDerrel Walters \xc2\xa9 2019\n\nDemonstrating derrelSORT\xc2\xa9\nWARNING: This program can produce LARGE files!\n\nGenerating random array of length:           100000\nTime =    0.0000 seconds\nWriting Array to rand-in.txt:\nTime =    0.0312 seconds\nSorting the Array\nTime =    0.0156 seconds\nWriting Array to rand-sorted-out.txt:\nTime =    0.0469 seconds\n\n\nDerrel Walters \xc2\xa9 2019\n\nDemonstrating derrelSORT\xc2\xa9\nWARNING: This program can produce LARGE files!\n\nGenerating random array of length:           200000\nTime =    0.0000 seconds\nWriting Array to rand-in.txt:\nTime =    0.0625 seconds\nSorting the Array\nTime =    0.0312 seconds\nWriting Array to rand-sorted-out.txt:\nTime =    0.0312 seconds\n\n\nDerrel Walters \xc2\xa9 2019\n\nDemonstrating derrelSORT\xc2\xa9\nWARNING: This program can produce LARGE files!\n\nGenerating random array of length:           400000\nTime =    0.0156 seconds\nWriting Array to rand-in.txt:\nTime =    0.1250 seconds\nSorting the Array\nTime =    0.0625 seconds\nWriting Array to rand-sorted-out.txt:\nTime =    0.0938 seconds\n\n\nDerrel Walters \xc2\xa9 2019\n\nDemonstrating derrelSORT\xc2\xa9\nWARNING: This program can produce LARGE files!\n\nGenerating random array of length:           800000\nTime =    0.0156 seconds\nWriting Array to rand-in.txt:\nTime =    0.2344 seconds\nSorting the Array\nTime =    0.1406 seconds\nWriting Array to rand-sorted-out.txt:\nTime =    0.2031 seconds\n\n\nDerrel Walters \xc2\xa9 2019\n\nDemonstrating derrelSORT\xc2\xa9\nWARNING: This program can produce LARGE files!\n\nGenerating random array of length:          1600000\nTime =    0.0312 seconds\nWriting Array to rand-in.txt:\nTime =    0.4219 seconds\nSorting the Array\nTime =    0.2969 seconds\nWriting Array to rand-sorted-out.txt:\nTime =    0.3906 seconds\n\n\nDerrel Walters \xc2\xa9 2019\n\nDemonstrating derrelSORT\xc2\xa9\nWARNING: This program can produce LARGE files!\n\nGenerating random array of length:          3200000\nTime =    0.0625 seconds\nWriting Array to rand-in.txt:\nTime =    0.8281 seconds\nSorting the Array\nTime =    0.6562 seconds\nWriting Array to rand-sorted-out.txt:\nTime =    0.7969 seconds\n\n\nDerrel Walters \xc2\xa9 2019\n\nDemonstrating derrelSORT\xc2\xa9\nWARNING: This program can produce LARGE files!\n\nGenerating random array of length:          6400000\nTime =    0.0938 seconds\nWriting Array to rand-in.txt:\nTime =    1.5938 seconds\nSorting the Array\nTime =    1.3281 seconds\nWriting Array to rand-sorted-out.txt:\nTime =    1.6406 seconds\n\n\nDerrel Walters \xc2\xa9 2019\n\nDemonstrating derrelSORT\xc2\xa9\nWARNING: This program can produce LARGE files!\n\nGenerating random array of length:         12800000\nTime =    0.2500 seconds\nWriting Array to rand-in.txt:\nTime =    3.3906 seconds\nSorting the Array\nTime =    2.7031 seconds\nWriting Array to rand-sorted-out.txt:\nTime =    3.2656 seconds\n\n\nDerrel Walters \xc2\xa9 2019\n\nDemonstrating derrelSORT\xc2\xa9\nWARNING: This program can produce LARGE files!\n\nGenerating random array of length:         25600000\nTime =    0.4062 seconds\nWriting Array to rand-in.txt:\nTime =    6.6250 seconds\nSorting the Array\nTime =    5.6094 seconds\nWriting Array to rand-sorted-out.txt:\nTime =    6.5312 seconds\n\n\nDerrel Walters \xc2\xa9 2019\n\nDemonstrating derrelSORT\xc2\xa9\nWARNING: This program can produce LARGE files!\n\nGenerating random array of length:         51200000\nTime =    0.8281 seconds\nWriting Array to rand-in.txt:\nTime =   13.2656 seconds\nSorting the Array\nTime =   11.5000 seconds\nWriting Array to rand-sorted-out.txt:\nTime =   13.1719 seconds\n\n\nDerrel Walters \xc2\xa9 2019\n\nDemonstrating derrelSORT\xc2\xa9\nWARNING: This program can produce LARGE files!\n\nGenerating random array of length:        102400000\nTime =    1.6406 seconds\nWriting Array to rand-in.txt:\nTime =   26.3750 seconds\nSorting the Array\nTime =   23.3438 seconds\nWriting Array to rand-sorted-out.txt:\nTime =   27.0625 seconds\n\n\nDerrel Walters \xc2\xa9 2019\n\nDemonstrating derrelSORT\xc2\xa9\nWARNING: This program can produce LARGE files!\n\nGenerating random array of length:        204800000\nTime =    3.3438 seconds\nWriting Array to rand-in.txt:\nTime =   53.1094 seconds\nSorting the Array\nTime =   47.3750 seconds\nWriting Array to rand-sorted-out.txt:\nTime =   52.8906 seconds\n\n\nDerrel Walters \xc2\xa9 2019\n\nDemonstrating derrelSORT\xc2\xa9\nWARNING: This program can produce LARGE files!\n\nGenerating random array of length:        409600000\nTime =    6.6562 seconds\nWriting Array to rand-in.txt:\nTime =  105.1875 seconds\nSorting the Array\nTime =   99.5938 seconds\nWriting Array to rand-sorted-out.txt:\nTime =  109.9062 seconds\n
Run Code Online (Sandbox Code Playgroud)\n

这是 2019 年的程序,未经修改:

\n
SORT-RESEARCH$ cat derrelSORT.f90\n! Derrel Walters \xc2\xa9 2019\n! These sort routines were written by Derrel Walters ~ 2019-01-23\n\nPROGRAM sort_test\n  ! This program demonstrates a linear sort routine\n  ! by generating a random array (here integer), writing it\n  ! to a file \'rand-in.txt\', sorting it with an\n  ! implementation of derrelSORT (here for integers -\n  ! where the same principles apply for other applicable\n  ! datatypes), and finally, printing the sorted array\n  ! to a file \'rand-sorted-out.txt\'.\n  !\n  ! To the best understanding of the author, the expert\n  ! consensus is that a comparative sort can, at best,\n  ! be done with O(nlogn) time complexity. Here a sort\n  ! is demonstrated which experimentally runs O(n).\n  !\n  ! Such time complexity is currently considered impossible\n  ! for a sort. Using this sort, extremely large amounts of data can be\n  ! sorted on any modern computer using a single processor core -\n  ! provided the computer has enough memory to hold the array! For example,\n  ! the sorting time for a given array will be on par (perhaps less than)\n  ! what it takes the same computer to write the array to a file.\n  !\n  ! ~ Derrel Walters\n\n  IMPLICIT NONE\n\n  INTEGER,PARAMETER :: in_unit = 21\n  INTEGER,PARAMETER :: out_unit = 23\n\n  INTEGER,DIMENSION(:),ALLOCATABLE :: iArrA\n  REAL,DIMENSION(:),ALLOCATABLE :: rArrA\n  CHARACTER(LEN=15) :: cDims\n  CHARACTER(LEN=80) :: ioMsgStr\n  INTEGER(KIND=8) :: nDims, i\n  INTEGER :: iStat\n  REAL :: start, finish\n\n  WRITE(*,*) \'\'\n  WRITE(*,\'(A)\') \'Derrel Walters \xc2\xa9 2019\'\n  WRITE(*,*) \'\'\n  WRITE(*,\'(A)\') \'Demonstrating derrelSORT\xc2\xa9\'\n  WRITE(*,\'(A)\') \'WARNING: This program can produce LARGE files!\'\n  WRITE(*,*) \'\'\n\n  CALL GET_COMMAND_ARGUMENT(1, cDims)\n  IF (cDims == \'\') THEN\n    nDims = 1000000\n  ELSE\n    READ(cDims,\'(1I15)\') nDims\n  END IF\n  ALLOCATE(iArrA(nDims),rArrA(nDims),STAT=iStat)\n\n  WRITE(*,\'(A,1X,1I16)\') \'Generating random array of length:\', nDims\n  CALL CPU_TIME(start)\n  CALL RANDOM_NUMBER(rArrA)\n  iArrA = INT(rArrA*1000000)\n  CALL CPU_TIME(finish)\n  WRITE(*,\'(A,1X,f9.4,1X,A)\') \'Time =\',finish-start,\'seconds\'\n  DEALLOCATE(rArrA,STAT=iStat)\n\n  WRITE(*,\'(A)\') \'Writing Array to rand-in.txt: \'\n  OPEN(UNIT=in_unit,FILE=\'rand-in.txt\',STATUS=\'REPLACE\',ACTION=\'WRITE\',IOSTAT=iStat,IOMSG=ioMsgStr)\n  IF (iStat /= 0) T

Jim*_*hel 10

你的算法不是 O(n)。您计算出的循环次数 ( nLoops) 是log2(n)。内部循环的数量( 中的值iterMark)本质上是 n/2、n/4、n/8 等。但是段大小实际上并不重要,因为每次通过外部循环时,您都会查看列表。

无论您如何混淆它,您都会执行 log2(n) 传递 n 个项目:O(n log n)。

您的代码是相当标准的合并排序,已证明其时间复杂度为 O(n log n)。已经充分证明,比较排序的一般情况是 O(n log n)。当然,某些算法可以更快地对某些特定情况进行排序。相反,相同的算法也有需要 O(n^2) 的病态情况。其他比较排序(例如堆排序、合并排序)不太受项目顺序的影响。但在一般情况下,比较排序按 n log n 次比较的顺序进行。有关详细说明,请参阅https://www.cs.cmu.edu/~avrim/451f11/lectures/lect0913.pdf 。

但不要相信我的话。您可以通过做一些简单的计时来轻松测试自己。计算对 100K 件物品进行排序所需的时间。如果您的算法确实是 O(n),那么对 200K 个项目进行排序大约需要两倍的时间,对 100 万个项目进行排序需要十倍的时间。但如果是 O(n log n),正如我怀疑的那样,那么时间会更长一些。

考虑:100K 的 log(2) 是 16.61。200K 的 log(2) 为 17.61。因此,排序 100K 项(如果算法为 O(log n))将花费与 100K * 16.61 成比例的时间。对 200K 个项目进行排序所需的时间与 200K * 17.71 成正比。做算术:

100K * 16.61 = 1,661,000
200K * 17.61 = 3,522,000
Run Code Online (Sandbox Code Playgroud)

因此,200K 项大约需要 2.12 倍 (3,522,000/1,661,000) 的时间。或者,比线性算法长大约 10%。

如果您仍然不确定,请将其增加到一百万个项目。如果算法是线性的,那么 100 万个项目所花费的时间是 100K 个项目所花费时间的 10 倍。如果是 O(n log n),则需要 12 倍的时间。

1M * 19.93 = 19,930,000
(19,930,000 / 1,661,000) = 11.9987 (call it 12)
Run Code Online (Sandbox Code Playgroud)

  • @DJWalters 你自己的数字显示每个元素在“O(n log(n))”预期范围内的减速。如果您对代码进行检测来计算比较,这将显示出与理论的更好匹配。而且,无论您是否理解,您声称在线性时间内运行比较排序都是不可能的。 (5认同)
  • @DJWalters在您的数据集中,每个元素排序的时间数字从 1.56e-07 秒上升到 2.4314892578125e-07 秒。增幅约为55.9%。它略低于“O(n log(n))”的理论 72.2%,因为您花费一些时间做诸如复制数据之类的事情,这些事情呈线性增长。但是,通过进行线性缩放的比较排序,您并没有违反数学定律。你真的、真的、真的没有。 (4认同)
  • @DJWalters 而且,不,我在评论之前没有运行代码。正如我所说,您所拥有的是相当标准的迭代合并排序的混淆实现。我已经看过数百次,自己也写过数十次。当简短的分析告诉我需要知道的一切时,我不需要运行代码。 (4认同)
  • 你可能不明白为什么这是不可能的,但这是你的问题,不是我的。您获得了 https://www.cs.cmu.edu/~avrim/451f11/lectures/lect0913.pdf 的链接,其中包含证明这是不可能的。 (4认同)
  • @DJWalters 没有人说你的算法不快。我们说它不是线性的。它在实践中比“n”写入操作运行得更快确实表明您已经想出了一个有效的算法。它不显示线性时间复杂度。当你接近无穷大时,数学上的收敛需要发生,但无论你设法运行多少固定步骤,它都不必发生。 (4认同)
  • @DJWalters 你显然不清楚的概念:“证明”意味着什么,big-O 是如何工作的,以及 log(n) 增长得有多慢。我不会再费心回复。你是唯一一个不清楚你的代码如何扩展的人,而且你太忙于争论为什么你是对的,而没有时间处理你为什么错了。 (2认同)