这个素数发生器的执行时间能否得到改善?

Cha*_*ion 10 optimization f#

写这篇文章的最初目标是尽可能减少占地面积.我可以充满信心地说,这个目标已经实现.不幸的是,这让我的执行速度相当慢.为了产生低于200万的所有质数,在3Ghz英特尔芯片上需要大约8秒.

反正有没有改善这段代码的执行时间,而对内存占用空间的贡献最小?或者,从功能的角度来看,我是否会以错误的方式解决这个问题?

/// 6.5s for max = 2,000,000
let generatePrimeNumbers max =    
    let rec generate number numberSequence =
        if number * number > max then numberSequence else
        let filteredNumbers = numberSequence |> Seq.filter (fun v -> v = number || v % number <> 0L)
        let newNumberSequence = seq { for i in filteredNumbers -> i }
        let newNumber = newNumberSequence |> Seq.find (fun x -> x > number)
        generate newNumber newNumberSequence                
    generate 2L (seq { for i in 2L..max -> i })
Run Code Online (Sandbox Code Playgroud)

更新

我调整了算法并设法削减了2秒,但内存消耗增加了一倍.

/// 5.2s for max = 2,000,000
let generatePrimeNumbers max =    
    let rec generate number numberSequence =
        if number * number > max then numberSequence else
        let filteredNumbers = numberSequence |> Seq.filter (fun v -> v = number || v % number <> 0L) |> Seq.toArray |> Array.toSeq
        let newNumber = filteredNumbers |> Seq.find (fun v -> v > number)
        generate newNumber filteredNumbers                
    generate 2L (seq { for i in 2L..max -> i })
Run Code Online (Sandbox Code Playgroud)

更新

显然,我使用的是旧编译器.使用最新版本,我的原始算法需要6.5s而不是8s.这是一个很大的改进.

Jul*_*iet 8

Tomas Petricek的功能非常快,但我们可以让它快一点.

比较以下内容:

let is_prime_tomas n =
    let ms = int64(sqrt(float(n)))
    let rec isPrimeUtil(m) =
        if m > ms then true
        elif n % m = 0L then false
        else isPrimeUtil(m + 1L)
    (n > 1L) && isPrimeUtil(2L)

let is_prime_juliet n =
    let maxFactor = int64(sqrt(float n))
    let rec loop testPrime tog =
        if testPrime > maxFactor then true
        elif n % testPrime = 0L then false
        else loop (testPrime + tog) (6L - tog)
    if n = 2L || n = 3L || n = 5L then true
    elif n <= 1L || n % 2L = 0L || n % 3L = 0L || n % 5L = 0L then false
    else loop 7L 4L
Run Code Online (Sandbox Code Playgroud)

is_prime_juliet内循环稍快一点.它是一个众所周知的素数生成策略,它使用"切换"以2或4的增量跳过非素数.用于比较:

> seq { 2L .. 2000000L } |> Seq.filter is_prime_tomas |> Seq.fold (fun acc _ -> acc + 1) 0;;
Real: 00:00:03.628, CPU: 00:00:03.588, GC gen0: 0, gen1: 0, gen2: 0
val it : int = 148933

> seq { 2L .. 2000000L } |> Seq.filter is_prime_juliet |> Seq.fold (fun acc _ -> acc + 1) 0;;
Real: 00:00:01.530, CPU: 00:00:01.513, GC gen0: 0, gen1: 0, gen2: 0
val it : int = 148933
Run Code Online (Sandbox Code Playgroud)

我的版本速度提高了2.37倍,而且它也非常接近最快命令版本的速度.我们可以使它更快,因为我们不需要过滤列表2L .. 2000000L,我们可以使用相同的策略在我们应用过滤器之前生成更优化的可能素数序列:

> let getPrimes upTo =
    seq {
        yield 2L;
        yield 3L;
        yield 5L;
        yield! (7L, 4L) |> Seq.unfold (fun (p, tog) -> if p <= upTo then Some(p, (p + tog, 6L - tog)) else None)
    }
    |> Seq.filter is_prime_juliet;;
Real: 00:00:00.000, CPU: 00:00:00.000, GC gen0: 0, gen1: 0, gen2: 0

val getPrimes : int64 -> seq<int64>

> getPrimes 2000000L |> Seq.fold (fun acc _ -> acc + 1) 0;;
Real: 00:00:01.364, CPU: 00:00:01.357, GC gen0: 36, gen1: 1, gen2: 0
val it : int = 148933
Run Code Online (Sandbox Code Playgroud)

我们从1.530秒下降到01.364秒,因此我们的速度提高了1.21倍.真棒!


Jul*_*iet 7

只是为了测试,我们来看看这个页面.

pi(x)是素数计数函数,它返回x以下的素数.您可以使用以下不等式近似pi(x):

(x/log x)(1 + 0.992/log x) < pi(x) < (x/log x)(1 + 1.2762/log x) 
// The upper bound holds for all x > 1
Run Code Online (Sandbox Code Playgroud)

p(x)是第n个素数函数,可以使用下式近似:

n ln n + n ln ln n - n < p(n) < n ln n + n ln ln n
// for n >= 6
Run Code Online (Sandbox Code Playgroud)

考虑到这一点,这里有一个非常快速的生成器,它计算前n个素数,其中每个元素在索引处i相等p(i).因此,如果我们想要将数组限制在低于2,000,000的素数,我们使用上限不等式作为素数计数函数:

let rec is_prime (primes : int[]) i testPrime maxFactor =
    if primes.[i] > maxFactor then true
    else
        if testPrime % primes.[i] = 0 then false
        else is_prime primes (i + 1) testPrime maxFactor

let rec prime_n (primes : int[]) primeCount testPrime tog =
    if primeCount < primes.Length then
        let primeCount' =
            if is_prime primes 2 testPrime (float testPrime |> sqrt |> int) then
                primes.[primeCount] <- testPrime
                primeCount + 1
            else
                primeCount
        prime_n primes primeCount' (testPrime + tog) (6 - tog)

let getPrimes upTo =
    let x = float upTo
    let arraySize = int <| (x / log x) * (1.0 + 1.2762 / log x)
    let primes = Array.zeroCreate (max arraySize 3)
    primes.[0] <- 2
    primes.[1] <- 3
    primes.[2] <- 5

    prime_n primes 3 7 4
    primes
Run Code Online (Sandbox Code Playgroud)

凉!那有多快?在我的3.2ghz四核上,我在fsi中得到以下内容:

> let primes = getPrimes 2000000;;
Real: 00:00:00.534, CPU: 00:00:00.546, GC gen0: 0, gen1: 0, gen2: 0

val primes : int [] =
  [|2; 3; 5; 7; 11; 13; 17; 19; 23; 29; 31; 37; 41; 43; 47; 53; 59; 61; 67; 71;
    73; 79; 83; 89; 97; 101; 103; 107; 109; 113; 127; 131; 137; 139; 149; 151;
    157; 163; 167; 173; 179; 181; 191; 193; 197; 199; 211; 223; 227; 229; 233;
    239; 241; 251; 257; 263; 269; 271; 277; 281; 283; 293; 307; 311; 313; 317;
    331; 337; 347; 349; 353; 359; 367; 373; 379; 383; 389; 397; 401; 409; 419;
    421; 431; 433; 439; 443; 449; 457; 461; 463; 467; 479; 487; 491; 499; 503;
    509; 521; 523; 541; ...|]

> printfn "total primes: %i. Last prime: %i" (primes.Length - 1) primes.[primes.Length - 1];;
total primes: 149973. Last prime: 2014853
Run Code Online (Sandbox Code Playgroud)

所以在不到半秒的时间里,所有素数都在200万左右:)