当阵列大于800,000,000时,我的Fortran筛子会显着减慢

aPh*_*ist 6 performance primes fortran

我是一名物理学家,最近我一直在和Fortran合作.最初我使用Java进行广泛的娱乐,因为它是我学习的第一种语言,但我已经放弃了它用于Fortran和C++.我对素数有业余热情,因此我创建了一个素数筛子.我能够在15秒内找到最多2 ^ 31的所有素数.这是Java的最大数组大小,所以这就是结束.我小心地移植了代码(我的意思是,我很沮丧,我的代码很慢,而且我找不到错误,我将我的Fortran代码移植回Java以验证它不是我的错,然后移植回来到Fortran,擦除每次迭代!).问题是大约800,000,000 Fortran将陷入停顿.到目前为止,它击败了Java,但在此之后,它的速度非常慢.我花了几个小时绘制它并拟合曲线.它以指数方式减速,可能需要数百年才能解决Javas级别问题.我问过很多人无济于事.为什么这发生在我身上?!?!有没有明智的Fortran程序员可以帮助我?我正在运行Macbook Pro 2013年末i5.我的代码如下.

program sieve
integer(4),allocatable:: prime(:)
integer(4)::a,max,b,primeCount
write(*,*)"Welcome to the slow prime number sieve!"
write(*,*)"--------------------------------------------"
write(*,*)"Up to what numbers do you need to find primes for?"
write(*,*)"Enter a number below 2^(32-1)"
read*, max

primeCount=0
allocate(prime(max))
prime(1)=1

    do a=2,int(sqrt(real(max))) !the main loop
        if(prime(a)==0)then !if the number is marked as prime procede
            do b=2*a,max,a  !eliminate all the numbers that are multiples of the number
                if(prime(b)==0)then !but only spend time eliminating if the number is still marked prime
                    prime(b)=1
                end if
            end do
        end if
    end do

do a=1,max
    if(prime(a)==0)then
        primeCount=primeCount+1
    end if
end do

print*, primeCount

end program
Run Code Online (Sandbox Code Playgroud)

编辑:我的机器安装了4 Gig的ram

Pet*_*trH 4

I can see several things you could do to speed the code up although none of them seems likely to explain the sharp drop in performance you experience. The most likely culprit seems to be the RAM constraint as Alexander Vogt suggests.

The first thing you should do is to change prime from integer to logical array. This reduces the memory requirements and also speeds up the evaluations of if (prime(a)==0).

The relevant parts of the code would then look as follows

logical(1),allocatable:: prime(:)

primeCount=0
allocate(prime(max))
prime = .false.
prime(1)=.true.

    do a=2,int(sqrt(real(max))) !the main loop
        if(.not. prime(a))then !if the number is marked as prime procede
            do b=2*a,max,a  !eliminate all the numbers that are multiples of the number
                if(.not. prime(b))then !but only spend time eliminating if the number is still marked prime
                    prime(b)=.true.
                end if
            end do
        end if
    end do

do a=1,max
    if(.not. prime(a))then
        primeCount=primeCount+1
    end if
end do
Run Code Online (Sandbox Code Playgroud)

I don't do any Java programming but in Matlab if you declare prime(1:max)=0 and then only switch values bewteen 0 and 1 I would think that Matlab treats the array as a logical array. Java might be doing the same. This could explain why your Java code does not suffer from the performance degradation (assuming the RAM constraint is really the issue).

EDIT: I have done a few experiments.

On my machine (Debian Linux 64bit, i3, 16GB RAM, ifort 14 with default flags) it took 22 seconds for max=800 million (8E8). It took 60 seconds for max=2E9. This is not hours as reported in the question. Also in each case the prime array happened to be initialized to zero.

Moreover using integer(1) makes the programme run 33% faster then using integer(4). With logical(1) it runs less then 5% faster then with integer(1). This behaviour is probably due to the better use of cash as each element of prime occupies less space in memory therefore the processor can do a larger number of iterations with the data currently in cash going through the loops much faster.

The conclusions I would draw from this would be that the culprit was the lack of RAM as noted by Alexander Vogt and that there is a fair chance the author's experience was not affected by the omission to initialize the prime array (although that definitely should not have happened) as HighPerformanceMark pointed out. Further I suspect that Java declared prime as a logical array and that is why the problem did not occur there. (Although 2^31 in 15 seconds in Java? The Fortran code used here comes nowhere close to this. Was really the same code compared?)