How to improve performance using F# idioms

Tao*_*Gil 8 performance f#

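I am using a machine learning course to learn F# at the same time. I did the following homework exercise, which is the first exercise of the second week:

Run a computer simulation for flipping 1,000 virtual fair coins. Flip each coin independently 10 times. Focus on 3 coins as follows: c1 is the first coin flipped, crand is a coin chosen randomly from the 1,000, and cmin is the coin which had the minimum frequency of heads (pick the earlier one in case of a tie).

ν1, νrand, and νmin are the fraction of heads obtained for the 3 respective coins out of the 10 tosses. Run the experiment 100,000 times in order to get a full distribution of ν1, νrand, and νmin (note that crand and cmin will change from run to run).

What is the average value of νmin?

I have produced the following code, which works fine and gives the correct answer: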

let private rnd = System.Random()
let FlipCoin() = rnd.NextDouble() > 0.5
let FlipCoinNTimes N = List.init N (fun _ -> FlipCoin())
let FlipMCoinsNTimes M N = List.init M (fun _ -> FlipCoinNTimes N)

let ObtainFrequencyOfHeads tosses = 
    let heads = tosses |> List.filter (fun toss -> toss = true)
    float (List.length (heads)) / float (List.length (tosses))

let GetFirstRandMinHeadsFraction allCoinsLaunchs = 
    let first = ObtainFrequencyOfHeads(List.head (allCoinsLaunchs))
    let randomCoin = List.item (rnd.Next(List.length (allCoinsLaunchs))) allCoinsLaunchs
    let random = ObtainFrequencyOfHeads(randomCoin)

    let min = 
        allCoinsLaunchs
        |> List.map (fun coin -> ObtainFrequencyOfHeads coin)
        |> List.min
    (first, random, min)

module Exercice1 = 
    let GetResult() = 
        Seq.init 100000 (fun _ -> FlipMCoinsNTimes 1000 10)
        |> Seq.map (fun oneExperiment -> GetFirstRandMinHeadsFraction oneExperiment)
        |> Seq.map (fun (first, random, min) -> min)
        |> Seq.average

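However, it takes roughly 4 minutes to run on my machine. I know it is doing a lot of work, but I wonder whether there are modifications that could optimize it.

As I am trying to learn F#, I am asking for optimizations that use F# idioms, not for changing the code to C style.

Feel free to suggest any improvement in style, good practices, etc.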

[UPDATE]

I have written some code to compare the proposed solutions; it can be accessed here.

These are the results:

Base - result: 0.037510, elapsed time: 00:00:55.1274883, improvement: 0.99 x

Matthew Mcveigh - result: 0.037497, elapsed time: 00:00:15.1682052, improvement: 3.61 x

Fyodor Soikin - result: 0.037524, elapsed time: 00:01:29.7168787, improvement: 0.61 x

GuyCoder - result: 0.037645, elapsed time: 00:00:02.0883482, improvement: 26.25 x

GuyCoder MathNet - result: 0.037666, elapsed time: 00:00:24.7596117, improvement: 2.21 x

TheQuickBrownFox - result: 0.037494, elapsed time: 00:00:34.2831239, improvement: 1.60 x

The winner in terms of time improvement is GuyCoder, so I will accept his answer. However, I find his code harder to understand.
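As a cross-check on the simulated values, the average of νmin can also be computed exactly instead of sampled. The sketch below is not taken from any of the answers; the names binomial and expectedMinFraction are made up for illustration. It relies on the identity E[min] = Σ P(min ≥ k) for k = 1..flips, and on the coins being independent, so that P(min ≥ k) = P(one coin shows at least k heads)^coins.

```fsharp
// n choose k as a float, built up as a running product.
let binomial (n: int) (k: int) : float =
    [ 1 .. k ] |> List.fold (fun acc i -> acc * float (n - k + i) / float i) 1.0

// Exact expected value of the minimum heads fraction over `coins` fair coins,
// each flipped `flips` times.
let expectedMinFraction (coins: int) (flips: int) : float =
    // Probability that one fair coin shows exactly k heads.
    let pmf k = binomial flips k * (0.5 ** float flips)
    // Probability that one fair coin shows at least k heads.
    let tailAtLeast k = [ k .. flips ] |> List.sumBy pmf
    // E[min heads] = sum over k >= 1 of P(every coin shows at least k heads).
    let expectedMinHeads =
        [ 1 .. flips ] |> List.sumBy (fun k -> tailAtLeast k ** float coins)
    expectedMinHeads / float flips

printfn "%f" (expectedMinFraction 1000 10)
```

For 1,000 coins and 10 flips this evaluates to about 0.0376, in line with the simulated results listed above.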

Mat*_*igh 6

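Allocating a large number of lists up front is heavy work; the algorithm can be processed online, e.g. via sequences or recursion. I converted all of the work into tail-recursive functions for some raw speed (the compiler will turn them into loops).

This is not guaranteed to be 100% correct, but hopefully it gives you the gist of where I was going with it: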

let private rnd = System.Random()
let flipCoin () = rnd.NextDouble() > 0.5

let frequencyOfHeads flipsPerCoin = 
    let rec countHeads numHeads i =
        if i < flipsPerCoin then
            let isHead = flipCoin ()
            countHeads (if isHead then numHeads + 1 else numHeads) (i + 1)
        else
            float numHeads

    countHeads 0 0 / float flipsPerCoin

let getFirstRandMinHeadsFraction numCoins flipsPerCoin = 
    let randomCoinI = rnd.Next numCoins

    let rec run first random min i =
        if i < numCoins then
            let frequency = frequencyOfHeads flipsPerCoin
            let first = if i = 0 then frequency else first
            let random = if i = randomCoinI then frequency else random
            let min = if min > frequency then frequency else min

            run first random min (i + 1)
        else
            (first, random, min)

    run 0.0 0.0 System.Double.MaxValue 0

module Exercice1 = 
    let getResult () = 
        let iterations, numCoins, numFlips = 100000, 1000, 10

        let getMinFromExperiment () =
            let (_, _, min) = getFirstRandMinHeadsFraction numCoins numFlips
            min

        let rec sumMinFromExperiments i sumOfMin =
            if i < iterations then
                sumMinFromExperiments (i + 1) (sumOfMin + getMinFromExperiment ())
            else
                sumOfMin

        let sum = sumMinFromExperiments 0 0.0
        sum / float iterations
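The answer mentions sequences as the other way to process the flips online. A minimal sketch of that variant follows; it is not the answer author's code, the function names are hypothetical, and no timing is claimed for it. It avoids storing the per-flip lists, but lazy sequences carry their own overhead, so it may well be slower than the tail-recursive version above.

```fsharp
let rnd = System.Random()

// Fraction of heads for one coin; the individual flips are summed as they
// are produced and never stored.
let headsFraction (flips: int) : float =
    let heads = Seq.init flips (fun _ -> if rnd.NextDouble() > 0.5 then 1 else 0) |> Seq.sum
    float heads / float flips

// Minimum heads fraction over all coins of one experiment.
let minHeadsFraction (coins: int) (flips: int) : float =
    Seq.init coins (fun _ -> headsFraction flips) |> Seq.min

// Average of the per-experiment minimum over all experiments.
let averageMin (experiments: int) (coins: int) (flips: int) : float =
    Seq.init experiments (fun _ -> minHeadsFraction coins flips) |> Seq.average
```

Calling averageMin 100000 1000 10 would run the full exercise; smaller arguments are handy for a quick check.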


Guy*_*der 4

Running your code on my computer and timing it, I get:

seconds: 68.481918
result: 0.47570994

Running my code on my computer and timing it, I get:

seconds: 14.003861
vOne: 0.498963
vRnd: 0.499793
vMin: 0.037675

vMin is closest to the correct answer, b: 0.01.

That is almost 5x faster.

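I did not tinker with each method and data structure to figure out what worked and why; I just used decades of experience to guide me. Clearly, not storing the intermediate values but only the results is a big improvement. Specifically, coinTest returns just the number of heads as an int, not a list of the results. Also, instead of getting a random number for each coin flip, it is advantageous to get one random number per coin and then use each bit of that number as a coin flip. That saves number of flips - 1 calls to a function. I also avoided float values until the very end; I do not think that saves CPU time, but thinking only in int simplified the thought process and let me concentrate on other efficiencies. I know this may sound odd, but the less I have to think about, the better the answers I get. I also ran coinTest only when necessary, e.g. only for the first coin and only for the random coin, and looked for all tails as an exit condition.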
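The bit trick described here can be isolated into a small sketch. This is an illustration rather than the answer's exact code: the names countHeads and flipCoinNTimes are chosen for this example, and it assumes at most 30 flips per coin so that 1 <<< flips stays a positive int. It draws from the full range [0, 2^flips); note that System.Random.Next treats its argument as an exclusive upper bound, so an upper bound of 2^flips - 1 would never produce the all-heads pattern.

```fsharp
let rnd = System.Random()

// Count the set bits among the lowest `flips` bits of `bits`;
// each set bit stands for one head.
let countHeads (flips: int) (bits: int) : int =
    let rec loop remaining value acc =
        if remaining = 0 then acc
        else loop (remaining - 1) (value >>> 1) (acc + (value &&& 1))
    loop flips bits 0

// A single random draw supplies every flip of one coin.
// Assumption: flips <= 30, so that 1 <<< flips does not overflow.
let flipCoinNTimes (flips: int) : int =
    countHeads flips (rnd.Next(1 <<< flips))
```

With 10 flips per coin, this replaces ten calls to the random number generator with one.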

namespace Workspace

module main =

    [<EntryPoint>]
    let main argv =

        let rnd = System.Random()
        let randomPick (limit : int) : int = rnd.Next(limit)   // [0 .. limit) it's a Python habit

        let numberOfCoins = 1000
        let numberOfFlips = 10
        let numberOfExperiements = 100000

        let coinTest (numberOfFlips : int) : int =
            let rec countHeads (flips : int) bitIndex (headCount : int) : int =
                if bitIndex < 0 then headCount
                else countHeads (flips >>> 1) (bitIndex-1) (headCount + (flips &&& 0x01))
            countHeads (randomPick ((pown 2 numberOfFlips) - 1)) numberOfFlips 0

        let runExperiement (numberOfCoins : int) (numberOfFlips : int) : (int * int * int) =
            let (randomCoin : int) = randomPick numberOfCoins
            let rec testCoin coinIndex (cFirst, cRnd, cMin, cFirstDone, cRanDone, cMinDone) : (int * int * int) =
                if (coinIndex < numberOfCoins) then
                    if (not cFirstDone || not cRanDone || not cMinDone) then
                        if (cFirstDone && cMinDone && (coinIndex <> randomCoin)) then
                            testCoin (coinIndex+1) (cFirst, cRnd, cMin, cFirstDone, cRanDone, cMinDone)
                        else
                            let headsTotal = coinTest numberOfFlips
                            let (cFirst, cRnd, cMin, cFirstDone, cRanDone, cMinDone) =
                                let cFirst = if coinIndex = 0 then headsTotal else cFirst
                                let cRnd = if coinIndex = randomCoin then headsTotal else cRnd
                                let cMin = if headsTotal < cMin then headsTotal else cMin
                                let cRanDone = if (coinIndex >= randomCoin) then true else cRanDone
                                let cMinDone = if (headsTotal = 0) then true else cMinDone
                                (cFirst, cRnd, cMin, true, cRanDone, cMinDone)
                            testCoin (coinIndex+1) (cFirst, cRnd, cMin, cFirstDone, cRanDone, cMinDone)
                    else
                        (cFirst, cRnd, cMin)
                else
                    (cFirst, cRnd, cMin)
            testCoin 0 (-1,-1,10, false, false, false)

        let runExperiements (numberOfExperiements : int) (numberOfCoins : int) ( numberOfFlips : int) =
            let rec accumateExperiements index aOne aRnd aMin : (int * int * int) =
                let (cOne,cRnd,cMin) = runExperiement numberOfCoins numberOfFlips
                if index > numberOfExperiements then (aOne, aRnd, aMin)
                else accumateExperiements (index + 1) (aOne + cOne) (aRnd + cRnd) (aMin + cMin)
            let (aOne, aRnd, aMin) = accumateExperiements 0 0 0 0
            let (vOne : double) = (double)(aOne) / (double)numberOfExperiements / (double)numberOfFlips
            let (vRnd : double) = (double)(aRnd) / (double)numberOfExperiements / (double)numberOfFlips
            let (vMin : double) = (double)(aMin) / (double)numberOfExperiements / (double)numberOfFlips
            (vOne, vRnd, vMin)

        let timeIt () =
            let stopWatch = System.Diagnostics.Stopwatch.StartNew()
            let (vOne, vRnd, vMin) = runExperiements numberOfExperiements numberOfCoins numberOfFlips
            stopWatch.Stop()
            printfn "seconds: %f" (stopWatch.Elapsed.TotalMilliseconds / 1000.0)
            printfn "vOne: %A" vOne
            printfn "vRnd: %A" vRnd
            printfn "vMin: %A" vMin

        timeIt ()

        printf "Press any key to exit: "
        System.Console.ReadKey() |> ignore
        printfn ""

        0 // return an integer exit code

========================================================================

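This is just an intermediate answer, because I asked whether the OP had considered using MathNet Numerics with idiomatic F#, and the OP wanted to see what that would look like. After running his version and this first-cut version on my machine, the OP's version is faster. OP: 75 seconds, mine: 84 seconds.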
namespace Workspace

open MathNet.Numerics.LinearAlgebra

module main =

    [<EntryPoint>]
    let main argv =

        let rnd = System.Random()
        let flipCoin() =
            let head = rnd.NextDouble() > 0.5
            if head then 1.0 else 0.0

        let numberOfCoins = 1000
        let numberOfFlips = 10
        let numberOfExperiements = 100000
        let numberOfValues = 3

        let randomPick (limit : int) : int = rnd.Next(limit)   // [0 .. limit) it's a Python habit
        let headCount (m : Matrix<float>) (coinIndex : int) : int =
            System.Convert.ToInt32((m.Row coinIndex).Sum())

        let minHeads (m : Matrix<float>) (numberOfCoins : int) (numberOfFlips : int) : int =
            let rec findMinHeads currentCoinIndex minHeadsCount minHeadsIndex =
                match currentCoinIndex,minHeadsCount with
                | -1,_ -> minHeadsCount
                | _,0 -> minHeadsCount  // Can't get less than zero so stop searching.
                | _ ->
                    let currentMinHeadCount = (headCount m currentCoinIndex)
                    let nextIndex = currentCoinIndex - 1
                    if currentMinHeadCount < minHeadsCount
                    then findMinHeads nextIndex currentMinHeadCount currentCoinIndex
                    else findMinHeads nextIndex minHeadsCount minHeadsIndex
            findMinHeads (numberOfCoins - 1) numberOfFlips -1

        // Return the values for cOne, cRnd, and cMin as int values.
        // Will do division on final sum of experiments instead of after each experiment.
        let runExperiement (numberOfCoins : int) (numberOfFlips : int) : (int * int * int) =
            let (flips : Matrix<float>) = DenseMatrix.init numberOfCoins numberOfFlips (fun i j -> flipCoin())
            let cOne = headCount flips 0
            let cRnd = headCount flips (randomPick numberOfCoins)
            let cMin = minHeads flips numberOfCoins numberOfFlips
            (cOne,cRnd,cMin)

        let runExperiements (numberOfExperiements : int) (numberOfCoins : int) (numberOfFlips : int) : (int [] * int [] * int []) =
            let (cOneArray : int[]) = Array.create numberOfExperiements 0
            let (cRndArray : int[]) = Array.create numberOfExperiements 0
            let (cMinArray : int[]) = Array.create numberOfExperiements 0
            for i = 0 to (numberOfExperiements - 1) do
                let (cOne,cRnd,cMin) = runExperiement numberOfCoins numberOfFlips
                cOneArray.[i] <- cOne
                cRndArray.[i] <- cRnd
                cMinArray.[i] <- cMin
            (cOneArray, cRndArray, cMinArray)

        let (cOneArray, cRndArray, cMinArray) = runExperiements numberOfExperiements numberOfCoins numberOfFlips
        let (vOne : double) = (double)(Array.sum cOneArray) / (double)numberOfExperiements / (double)numberOfFlips
        let (vRnd : double) = (double)(Array.sum cRndArray) / (double)numberOfExperiements / (double)numberOfFlips
        let (vMin : double) = (double)(Array.sum cMinArray) / (double)numberOfExperiements / (double)numberOfFlips

        printfn "vOne: %A" vOne
        printfn "vRnd: %A" vRnd
        printfn "vMin: %A" vMin

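While writing this I realized that I could do all of the calculations using just int. Only the final calculations that produce the percentages need to be float or double, and even that is only because the list of answers is given as percentages; in theory the numbers could be compared as int and lead to the same conclusion. If I used only int, I would have to create an int Matrix type, which is more work than I want to do. When I have time I will switch the MathNet Matrix to an F# Array2D or something similar and check. Note that if you tag this with MathNet, the maintainer of MathNet (Christoph Rüegg) might answer.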
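The switch to an int table that is mentioned here could look roughly like the following. This is only a sketch of the idea, not the answer author's code and not timed; the function name runExperiment and the use of Array2D.init are assumptions made for illustration.

```fsharp
let rnd = System.Random()

// One experiment on an int table: rows are coins, columns are flips,
// 1 stands for heads and 0 for tails.
let runExperiment (coins: int) (flips: int) : int * int * int =
    let table = Array2D.init coins flips (fun _ _ -> if rnd.NextDouble() > 0.5 then 1 else 0)
    // Number of heads in the row belonging to one coin.
    let headCount coin = Seq.init flips (fun j -> table.[coin, j]) |> Seq.sum
    let cOne = headCount 0
    let cRnd = headCount (rnd.Next coins)
    let cMin = Seq.init coins headCount |> Seq.min
    (cOne, cRnd, cMin)
```

As in the MathNet version, the three counts stay int, and the division into fractions would be done once on the summed results.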


I made a change to this method, and it is 5 seconds faster.

// faster
let minHeads (m : Matrix<float>) (numberOfCoins : int) (numberOfFlips : int) : int =
    let (mins : float[]) = m.FoldByRow((fun (x : float) y -> x + y), 0.0)
    let (minHead : float) = Array.min mins
    System.Convert.ToInt32(minHead)