为什么F#比C#慢得多?(素数基准)

Ala*_*nay 12 c# benchmarking primes f#

我认为F#意味着比C#更快,我做了一个可能不好的基准测试工具,C#得到16239ms,而F#做得更差,为49583ms.有人可以解释为什么会这样吗?我正在考虑离开F#并回到C#.是否可以通过更快的代码在F#中获得相同的结果?

这是我使用的代码,我尽可能地使它成为平等.

F#(49583ms)

open System
open System.Diagnostics

let stopwatch = new Stopwatch()
stopwatch.Start()

let mutable isPrime = true

for i in 2 .. 100000 do
    for j in 2 .. i do
        if i <> j && i % j = 0 then
            isPrime <- false
    if isPrime then
        printfn "%i" i
    isPrime <- true

stopwatch.Stop()
printfn "Elapsed time: %ims" stopwatch.ElapsedMilliseconds

Console.ReadKey() |> ignore
Run Code Online (Sandbox Code Playgroud)

C#(16239ms)

using System;
using System.Diagnostics;

namespace ConsoleApp1
{
    class Program
    {
        static void Main(string[] args)
        {
            Stopwatch stopwatch = new Stopwatch();
            stopwatch.Start();

            bool isPrime = true;

            for (int i = 2; i <= 100000; i++)
            {
                for (int j = 2; j <= i; j++)
                {
                    if (i != j && i % j == 0)
                    {
                        isPrime = false;
                        break;
                    }
                }
                if (isPrime)
                {
                    Console.WriteLine(i);
                }
                isPrime = true;
            }
            stopwatch.Stop();
            Console.WriteLine("Elapsed time: " + stopwatch.ElapsedMilliseconds + "ms");
            Console.ReadKey();
        }
    }
}
Run Code Online (Sandbox Code Playgroud)

rmu*_*unn 16

F#程序较慢,因为您的程序不相同.您的C#代码break在内部for循环中有一个语句,但您的F#程序没有.因此,对于每个偶数,C#代码将在检查可分性2后停止,而F#程序将检查从2到2的每个数字i.由于完成的工作有如此大的差异,实际上令人惊讶的是F#代码慢了三倍!

现在,F#故意没有break声明,所以你不能将C#代码直接翻译成F#.但您可以使用包含短路逻辑的功能.例如,在评论中,Aaron M. Eshbach建议如下:

{2 .. 100000}
|> Seq.filter (fun i -> {2 .. i-1} |> Seq.forall (fun j -> i % j <> 0))
|> Seq.iter (printfn "%i")
Run Code Online (Sandbox Code Playgroud)

这样Seq.forall做会使用,它会短路:它会根据条件检查序列中的每个输入,如果条件返回false,它将停止并不再进行检查.(因为Seq模块中的函数是惰性的,并且除了获得输出之外绝对不需要工作).这就像break在C#代码中有一个.

我将逐步完成这一步,以便您了解它是如何工作的:

{2 .. 100000}
Run Code Online (Sandbox Code Playgroud)

这会创建一个懒惰的整数序列,从2开始并上升到(包括)100000.

|> Seq.filter (fun i -> (some expression involving i))
Run Code Online (Sandbox Code Playgroud)

我把下一行分为两部分:外部Seq.filter部分和涉及的内部表达式i.该Seq.filter部分接受序列并过滤它:对于序列中的每个项目,调用它i并评估表达式.如果该表达式求值为true,则保留该项并将其传递给链中的下一步.如果表达式为false,则抛弃该项.

现在,涉及的表达式i是:

{2 .. i-1} |> Seq.forall (fun j -> i % j <> 0)
Run Code Online (Sandbox Code Playgroud)

这首先构造一个从2开始并一直向上i减1 的延迟序列.(或者你可以把它想象为从2开始到达i,但不包括i).然后它检查该序列中的每个项是否满足某个条件(即该Seq.forall函数).并且,作为一个实现细节Seq.forall,因为它是懒惰的并且没有比它绝对必须的更多的工作,它找到false结果的那一刻它将停止并且不再经历输入序列.(因为一旦找到一个反例,forall函数就不再可能返回true,所以只要知道结果就会停止.)检查的表达式是Seq.forall什么?是的fun j -> i % j <> 0.所以j是内循环变量,i是外变量(一个在所分配的Seq.filter部分),并且逻辑是一样的作为C#环路.

现在,请记住我们在Seq.filter这里.所以如果Seq.forall返回true,那么Seq.filter将保持值i.但是如果Seq.forall返回false,那么Seq.filter将丢弃此值i从传递到下一步.

最后,我们将此行作为下一步(也是最后一步):

|> Seq.iter (printfn "%i")
Run Code Online (Sandbox Code Playgroud)

这样做几乎完全相同:

for number in inputSoFar do
    printfn "%i" number
Run Code Online (Sandbox Code Playgroud)

(printfn "%i")如果你是F#的新手,这个电话可能看起来很不寻常.这是一个令人讨厌的问题,这是一个非常有用的概念,重要的是要习惯.所以花点时间考虑一下:在F#中,以下两行代码是完全等价的:

(fun y -> someFunctionCall x y)
(someFunctionCall x)
Run Code Online (Sandbox Code Playgroud)

所以fun item -> printfn "%i" item总能被替换printfn "%i.并且Seq.iter相当于一个for循环:

inputSoFar |> Seq.iter (someFunctionCall x)
Run Code Online (Sandbox Code Playgroud)

完全等同于:

for item in inputSoFar do
    someFunctionCall x item
Run Code Online (Sandbox Code Playgroud)

所以你有它:为什么你的F#程序比较慢,以及如何编写一个F#程序,它将遵循与C#一样的逻辑,但它将具有相当于break它的语句.

  • @Alanay - 了解F#的列表函数(以及序列函数,它们具有基本相同的API)的最佳方式是Scott Wlaschin的**优秀**站点:https://fsharpforfunandprofit.com/posts/list-module-functions/ (4认同)

Ste*_*evo 8

我知道有一个已经被接受的答案,但只想添加这个.

多年来做了很多C#,但F#并不多.以下内容更类似于C#代码.

open System
open System.Diagnostics

let stopwatch = new Stopwatch()
stopwatch.Start()

let mutable loop = true

for i in 2 .. 100000 do
    let mutable j = 2
    while loop do
        if i <> j && i % j = 0 then
            loop <- false
        else
            j <- j + 1
            if j >= i then
                printfn "%i" i
                loop <- false
    loop <- true

stopwatch.Stop()
printfn "Elapsed time: %ims" stopwatch.ElapsedMilliseconds
Run Code Online (Sandbox Code Playgroud)

在我对LinqPad的测试中,上述速度比Aaron M. Eshbach提出的解决方案更快.

它也出现了令人惊讶的类似IL.

  • 这将略微加快,因为它消除了调用`Seq.filter`,`Seq.forall`和`Seq.iter`函数的开销.通过数十万或数百万次迭代,即使是几个函数调用的微小开销也会增加.我仍然会使用序列处理函数,因为意思更清晰,代码更惯用. (4认同)

Jus*_*mer 7

正如其他提到的那样,代码没有做同样的事情,你需要采用技术来确保在找到素数后停止内循环.

此外,您要将值打印到标准输出.当您进行CPU性能测试时,通常不需要这样做,因为大量时间可能会导致I/O偏差测试结果.

无论如何,即使有一个已接受的答案,我还是决定对此进行修补,以便看看将不同的建议解决方案与我自己的解决方案进行比较.

性能运行x64在.NET 4.7.1上处于模式.

我比较了不同的F#解决方案以及我自己的一些变体:

Running 'Original(F#)' with 100000 (10512)...
  ... it took 14533 ms with (0, 0, 0) cc and produces 9592 GOOD primes
Running 'Original(C#)' with 100000 (10512)...
  ... it took 1343 ms with (0, 0, 0) cc and produces 9592 GOOD primes
Running 'Aaron' with 100000 (10512)...
  ... it took 5027 ms with (3, 1, 0) cc and produces 9592 GOOD primes
Running 'SteveJ' with 100000 (10512)...
  ... it took 1640 ms with (0, 0, 0) cc and produces 9592 GOOD primes
Running 'Dumetrulo1' with 100000 (10512)...
  ... it took 1908 ms with (0, 0, 0) cc and produces 9592 GOOD primes
Running 'Dumetrulo2' with 100000 (10512)...
  ... it took 970 ms with (0, 0, 0) cc and produces 9592 GOOD primes
Running 'Simple' with 100000 (10512)...
  ... it took 621 ms with (0, 0, 0) cc and produces 9592 GOOD primes
Running 'PushStream' with 100000 (10512)...
  ... it took 1627 ms with (0, 0, 0) cc and produces 9592 GOOD primes
Running 'Unstalling' with 100000 (10512)...
  ... it took 551 ms with (0, 0, 0) cc and produces 9592 GOOD primes
Running 'Vectors' with 100000 (10512)...
  ... it took 1076 ms with (0, 0, 0) cc and produces 9592 GOOD primes
Running 'VectorsUnstalling' with 100000 (10512)...
  ... it took 1072 ms with (0, 0, 0) cc and produces 9592 GOOD primes
Running 'BestAttempt' with 100000 (10512)...
  ... it took 4 ms with (0, 0, 0) cc and produces 9592 GOOD primes  
Run Code Online (Sandbox Code Playgroud)
  1. Original(F#) - OP的原始F#代码更改为不使用stdout
  2. Original(C#) - OP的原始C#代码更改为不使用stdout
  3. Aaron- 使用的惯用法Seq.正如预期的那样Seq,表现通常并不顺利.
  4. SteveJ - @SteveJ试图模仿F#中的C#代码
  5. Dumetrulo1 - @dumetrulo在尾递归中实现了算法
  6. Dumetrulo2 - @dumetrulo通过步进+2而不是+1来改进算法(不需要检查偶数).
  7. Simple- 我尝试使用类似的方法进行Dumetrulo2尾递归.
  8. PushStream- 我尝试使用简单的推送流(Seq是拉流)
  9. Unstalling - 我试图在所使用的指令具有延迟的情况下尝试卸载CPU
  10. Vectors- 我尝试System.Numerics.Vectors每次操作使用多个分区(也就是SIMD).不幸的是,矢量库不支持mod所以我不得不模仿它.
  11. VectorsUnstalling- 我试图Vectors通过尝试卸载CPU 来改进.
  12. BestAttempt- 喜欢Simple但仅sqrt n在确定是否为素数时检查数字.

包起来

  1. F#循环没有也continue没有break.F#中的尾递归是IMO实现需要的循环的更好方法break.
  2. 在比较语言的性能时,应该比较最佳性能或比较惯用解决方案的性能吗?我个人认为最好的表现是正确的方式,但我知道人们不同意我(我写了一个mandelbrot版本用于F#游戏的基准测试,其性能与C相当但不被接受,因为这种风格被认为是非-Fio的F#).
  3. Seq在F#中,不幸的是增加了很多开销.即使开销不相关,我也很难使用它.
  4. 现代CPU指令具有不同的吞吐量和延迟数.这意味着有时为了加速性能,需要在内部循环中处理多个独立样本,以允许乱序执行单元重新排序程序以隐藏延迟.如果您的CPU具有超线程并且您在多个线程上运行算法,则超线程可以"自动"缓解延迟.
  5. 缺少mod向量阻止了尝试使用SIMD获得超过非SIMD解决方案的任何性能.
  6. 如果我修改Unstalling尝试循环次数与C#代码相同,则最终结果1100 ms在F#中与1343 msC#相比.因此,F#可以与C#非常相似.如果再应用一些技巧,它只需要4 msC#也是如此.无论如何,从几乎15 sec到相当不错4 ms.

希望有人感兴趣

完整源代码:

module Common = 
  open System
  open System.Diagnostics

  let now =
    let sw = Stopwatch ()
    sw.Start ()
    fun () -> sw.ElapsedMilliseconds

  let time i a =
    let inline cc i       = GC.CollectionCount i

    let ii = i ()

    GC.Collect (2, GCCollectionMode.Forced, true)

    let bcc0, bcc1, bcc2  = cc 0, cc 1, cc 2
    let b                 = now ()

    let v = a ii

    let e = now ()
    let ecc0, ecc1, ecc2  = cc 0, cc 1, cc 2

    v, (e - b), ecc0 - bcc0, ecc1 - bcc1, ecc2 - bcc2

  let limit    = 100000
  // pi(x) ~= limit/(ln limit - 1)
  // Using pi(x) ~= limit/(ln limit - 2) to over-estimate
  let estimate = float limit / (log (float limit) - 1.0 - 1.0) |> round |> int

module Original =
  let primes limit =
    let ra = ResizeArray Common.estimate

    let mutable isPrime = true

    for i in 2 .. limit do
      for j in 2 .. i do
        if i <> j && i % j = 0 then
          isPrime <- false
      if isPrime then
          ra.Add i
      isPrime <- true

    ra.ToArray ()

module SolutionAaron =
  let primes limit =
    {2 .. limit}
    |> Seq.filter (fun i -> {2 .. i-1} |> Seq.forall (fun j -> i % j <> 0))
    |> Seq.toArray

module SolutionSteveJ =
  let primes limit =
    let ra = ResizeArray Common.estimate
    let mutable loop = true

    for i in 2 .. limit do
        let mutable j = 2
        while loop do
            if i <> j && i % j = 0 then
                loop <- false
            else
                j <- j + 1
                if j >= i then
                    ra.Add i
                    loop <- false
        loop <- true

    ra.ToArray ()

module SolutionDumetrulo1 =
  let rec isPrimeLoop (ra : ResizeArray<_>) i j limit =
    if i > limit then ra.ToArray ()
    elif j > i then
      ra.Add i
      isPrimeLoop ra (i + 1) 2 limit
    elif i <> j && i % j = 0 then
      isPrimeLoop ra (i + 1) 2 limit
    else
      isPrimeLoop ra i (j + 1) limit

  let primes limit =
    isPrimeLoop (ResizeArray Common.estimate) 2 2 limit

module SolutionDumetrulo2 =
  let rec isPrimeLoop (ra : ResizeArray<_>) i j limit =
    let incr x = if x = 2 then 3 else x + 2
    if i > limit then ra.ToArray ()
    elif j > i then
      ra.Add i
      isPrimeLoop ra (incr i) 2 limit
    elif i <> j && i % j = 0 then
      isPrimeLoop ra (incr i) 2 limit
    else
      isPrimeLoop ra i (incr j) limit

  let primes limit =
    isPrimeLoop (ResizeArray Common.estimate) 2 2 limit

module SolutionSimple =
  let rec isPrime i j k =
    if i < k then
      (j % i) <> 0 && isPrime (i + 2) j k
    else
      true

  let rec isPrimeLoop (ra : ResizeArray<_>) i limit =
    if i < limit then 
      if isPrime 3 i i then
        ra.Add i
      isPrimeLoop ra (i + 2) limit
    else
      ra.ToArray ()

  let primes limit =
    let ra = ResizeArray Common.estimate
    ra.Add 2
    isPrimeLoop ra 3 limit

module SolutionPushStream =
  type Receiver<'T> = 'T -> bool
  type PushStream<'T> = Receiver<'T> -> bool

  module Details =
    module Loops =
      let rec range e r i =
        if i <= e then
          if r i then
            range e r (i + 1)
          else
            false
        else
          true

  open Details

  let range s e : PushStream<int> =
    fun r -> Loops.range e r s

  let filter p (t : PushStream<'T>) : PushStream<'T> =
    fun r -> t (fun v -> if p v then r v else true)

  let forall p (t : PushStream<'T>) : bool =
    t p

  let toArray (t : PushStream<'T>) : _ [] =
    let ra = ResizeArray 16

    t (fun v -> ra.Add v; true) |> ignore

    ra.ToArray ()

  let primes limit =
    range 2 limit
    |> filter (fun i -> range 2 (i - 1) |> forall (fun j -> i % j <> 0))
    |> toArray

module SolutionUnstalling =
  let rec isPrime i j k =
    if i + 6 < k then
      (j % i) <> 0 && (j % (i + 2)) <> 0 && (j % (i + 4)) <> 0 && (j % (i + 6)) <> 0  && isPrime (i + 8) j k
    else
      true

  let rec isPrimeLoop (ra : ResizeArray<_>) i limit =
    if i < limit then 
      if isPrime 3 i i then
        ra.Add i
      isPrimeLoop ra (i + 2) limit
    else
      ra.ToArray ()

  let primes limit =
    let ra = ResizeArray Common.estimate
    ra.Add 2
    ra.Add 3
    ra.Add 5
    ra.Add 7
    ra.Add 11
    ra.Add 13
    ra.Add 17
    ra.Add 19
    ra.Add 23
    isPrimeLoop ra 29 limit

module SolutionVectors =
  open System.Numerics

  assert (Vector<int>.Count = 4)

  type I4 = Vector<int>

  let inline (%%) (i : I4) (j : I4) : I4 =
    i - (j * (i / j))

  let init : int [] = Array.zeroCreate 4

  let i4 v0 v1 v2 v3 =
    init.[0] <- v0
    init.[1] <- v1
    init.[2] <- v2
    init.[3] <- v3
    I4 init

  let i4_ (v0 : int) =
    I4 v0

  let zero    = I4.Zero
  let one     = I4.One 
  let two     = one + one
  let eight   = two*two*two

  let step = i4 3 5 7 9

  let rec isPrime (i : I4) (j : I4) k l =
    if l + 6 < k then
      Vector.EqualsAny (j %% i, zero) |> not && isPrime (i + eight) j k (l + 8)
    else
      true

  let rec isPrimeLoop (ra : ResizeArray<_>) i limit =
    if i < limit then 
      if isPrime step (i4_ i) i 3 then
        ra.Add i
      isPrimeLoop ra (i + 2) limit
    else
      ra.ToArray ()

  let primes limit =
    let ra = ResizeArray Common.estimate
    ra.Add 2
    ra.Add 3
    ra.Add 5
    ra.Add 7
    ra.Add 11
    ra.Add 13
    ra.Add 17
    ra.Add 19
    ra.Add 23
    isPrimeLoop ra 29 limit

module SolutionVectorsUnstalling =
  open System.Numerics

  assert (Vector<int>.Count = 4)

  type I4 = Vector<int>

  let init : int [] = Array.zeroCreate 4

  let i4 v0 v1 v2 v3 =
    init.[0] <- v0
    init.[1] <- v1
    init.[2] <- v2
    init.[3] <- v3
    I4 init

  let i4_ (v0 : int) =
    I4 v0

  let zero    = I4.Zero
  let one     = I4.One 
  let two     = one + one
  let eight   = two*two*two
  let sixteen = two*eight

  let step = i4 3 5 7 9

  let rec isPrime (i : I4) (j : I4) k l =
    if l + 14 < k then
      // i - (j * (i / j))      
      let i0 = i
      let i8 = i + eight
      let d0 = j / i0
      let d8 = j / i8
      let n0 = i0 * d0
      let n8 = i8 * d8
      let r0 = j - n0
      let r8 = j - n8
      Vector.EqualsAny (r0, zero) |> not && Vector.EqualsAny (r8, zero) |> not && isPrime (i + sixteen) j k (l + 16)
    else
      true

  let rec isPrimeLoop (ra : ResizeArray<_>) i limit =
    if i < limit then 
      if isPrime step (i4_ i) i 3 then
        ra.Add i
      isPrimeLoop ra (i + 2) limit
    else
      ra.ToArray ()

  let primes limit =
    let ra = ResizeArray Common.estimate
    ra.Add 2
    ra.Add 3
    ra.Add 5
    ra.Add 7
    ra.Add 11
    ra.Add 13
    ra.Add 17
    ra.Add 19
    ra.Add 23
    isPrimeLoop ra 29 limit

module SolutionBestAttempt =
  let rec isPrime i j k =
    if i < k then
      (j % i) <> 0 && isPrime (i + 2) j k
    else
      true

  let inline isqrt i = (i |> float |> sqrt) + 1. |> int

  let rec isPrimeLoop (ra : ResizeArray<_>) i limit =
    if i < limit then 
      if isPrime 3 i (isqrt i) then
        ra.Add i
      isPrimeLoop ra (i + 2) limit
    else
      ra.ToArray ()

  let primes limit =
    let ra = ResizeArray Common.estimate
    ra.Add 2
    isPrimeLoop ra 3 limit

[<EntryPoint>]
let main argv =

  let testCases =
    [|
      "Original"    , Original.primes
      "Aaron"       , SolutionAaron.primes
      "SteveJ"      , SolutionSteveJ.primes
      "Dumetrulo1"  , SolutionDumetrulo1.primes
      "Dumetrulo2"  , SolutionDumetrulo2.primes
      "Simple"            , SolutionSimple.primes
      "PushStream"        , SolutionPushStream.primes
      "Unstalling"        , SolutionUnstalling.primes
      "Vectors"           , SolutionVectors.primes
      "VectorsUnstalling" , SolutionVectors.primes
      "BestAttempt"       , SolutionBestAttempt.primes
    |]

  do
    // Warm-up
    printfn "Warm up"
    for _, a in testCases do
      for i = 0 to 100 do
        a 100 |> ignore

  do
    let init ()   = Common.limit

    let expected  = SolutionSimple.primes Common.limit

    for testCase, a in testCases do
      printfn "Running '%s' with %d (%d)..." testCase Common.limit Common.estimate
      let actual, time, cc0, cc1, cc2 = Common.time init a
      let result = if expected = actual then "GOOD" else "BAD"
      printfn "  ... it took %d ms with (%d, %d, %d) cc and produces %d %s primes" time cc0 cc1 cc2 actual.Length result 

  0
Run Code Online (Sandbox Code Playgroud)