OCaml optimization techniques

HKT*_*Lee 3 optimization performance ocaml functional-programming

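I'm new to OCaml (with some prior knowledge of Haskell). I want to convince myself to adopt OCaml, so I tried comparing the performance of C and OCaml. I wrote the following naive Monte Carlo Pi-finder:

C version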

#include <stdio.h>
#include <stdlib.h>

int main(int argc, const char * argv[]) {

    const int N = 10000000;
    const int M = 10000000;

    int count = 0;
    for (int i = 0; i < N; i++) {
        double x = (double)(random() % (2 * M + 1) - M) / (double)(M);
        double y = (double)(random() % (2 * M + 1) - M) / (double)(M);
        if (x * x + y * y <= 1) {
            count++;
        }
    }

    double pi_approx = 4.0 * (double)(count) / (double)(N);
    printf("pi .= %f", pi_approx);
    return 0;
}

OCaml version

let findPi m n = 
    let rec countPi count = function 
        | 0 -> count
        | n ->
            let x = float_of_int (Random.int (2 * m + 1) - m) /. (float_of_int m) in
            let y = float_of_int (Random.int (2 * m + 1) - m) /. (float_of_int m) in
            if x *. x +. y *. y <= 1. then
                countPi (count + 1) (n - 1)
            else
                countPi count (n - 1) in
    4.0 *. (float_of_int (countPi 0 n)) /. (float_of_int n);;

let n = 10000000 in
let m = 10000000 in

let pi_approx = findPi m n in
Printf.printf "pi .= %f" pi_approx

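I compiled the C with Clang (Apple LLVM version 5.1) and the OCaml with ocamlopt v4.01.0.

The C version runs in 0.105s. The OCaml version takes 0.945s, which is 9 times slower. My goal is to cut the OCaml running time by a factor of 3, so that the program finishes in 0.315s.

Since I'm new to OCaml, I'd like to learn some OCaml optimization techniques. Please give me some advice! (Tail recursion is already applied; without it the program would crash with a stack overflow.)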
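For reference, the tail-recursion point can be sketched as follows. This is a minimal illustration, not part of the programs above: the names `count_naive`, `count_acc`, and the predicate `p` are hypothetical. The first version does its addition after the recursive call returns, so it needs one stack frame per iteration; the second passes the running total as an accumulator, the same shape as `countPi`, and runs in constant stack space.

```ocaml
(* Not tail-recursive: the addition runs after the recursive call returns,
   so each iteration keeps a stack frame alive. *)
let rec count_naive p = function
  | 0 -> 0
  | k -> (if p k then 1 else 0) + count_naive p (k - 1)

(* Tail-recursive: the running total is passed along as an accumulator,
   so the recursive call is the last thing the function does. *)
let count_acc p n =
  let rec go acc = function
    | 0 -> acc
    | k -> go (if p k then acc + 1 else acc) (k - 1)
  in
  go 0 n

let () =
  let is_even k = k mod 2 = 0 in
  (* The naive version is only called with a small argument here. *)
  Printf.printf "%d %d\n" (count_naive is_even 10) (count_acc is_even 10_000_000)
```

Whether the naive version actually overflows for a given `n` depends on the OCaml version and the system stack limit, so the sketch only calls it with a small argument.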

Jef*_*eld 11

Here is what I see if I use the same random number generator in both tests.

Here is a stub for calling random() from OCaml:

#include <stdlib.h>

#include <caml/mlvalues.h>

value crandom(value v)
{
    return Val_int(random());
}

Here is the modified OCaml code:

external crandom : unit -> int = "crandom"

let findPi m n =
    let rec countPi count = function
        | 0 -> count
        | n ->
            let x = float_of_int (crandom () mod (2 * m + 1) - m) /. (float_of_int m) in
            let y = float_of_int (crandom () mod (2 * m + 1) - m) /. (float_of_int m) in
            if x *. x +. y *. y <= 1. then
                countPi (count + 1) (n - 1)
            else
                countPi count (n - 1) in
    4.0 *. (float_of_int (countPi 0 n)) /. (float_of_int n);;

let n = 10000000 in
let m = 10000000 in

let pi_approx = findPi m n in
Printf.printf "pi .= %f" pi_approx

I also copied your C code.

Here is a session showing both programs on my Mac (2.3 GHz Intel Core i7):

$ time findpic
pi .= 3.140129
real    0m0.346s
user    0m0.343s
sys     0m0.002s
$ time findpic
pi .= 3.140129
real    0m0.342s
user    0m0.340s
sys     0m0.001s
$ time findpiml
pi .= 3.140129
real    0m0.396s
user    0m0.394s
sys     0m0.002s
$ time findpiml
pi .= 3.140129
real    0m0.395s
user    0m0.393s
sys     0m0.002s

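It looks like the OCaml code is about 15% slower.

I didn't try to make it any faster; I just replaced the random number generator with the one the C code uses.

Your code actually seems hard to improve on (i.e., it's good code).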
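A rough way to check how much of the running time goes to random number generation, without writing a C stub, is to time the RNG calls on their own. The following is only a sketch under stated assumptions: `find_pi` restates the question's function, while `rng_only` and `time` are hypothetical helpers introduced here, and `Sys.time` measures processor time, so the numbers are not directly comparable to the `time` output above.

```ocaml
(* Same Monte Carlo estimator as in the question. *)
let find_pi m n =
  let rec count_pi count = function
    | 0 -> count
    | k ->
      let x = float_of_int (Random.int (2 * m + 1) - m) /. float_of_int m in
      let y = float_of_int (Random.int (2 * m + 1) - m) /. float_of_int m in
      if x *. x +. y *. y <= 1. then count_pi (count + 1) (k - 1)
      else count_pi count (k - 1)
  in
  4.0 *. float_of_int (count_pi 0 n) /. float_of_int n

(* Hypothetical helper: the same 2 * n calls to Random.int, with no
   floating-point work. The xor keeps the results in use. *)
let rng_only m n =
  let acc = ref 0 in
  for _i = 1 to 2 * n do
    acc := !acc lxor Random.int (2 * m + 1)
  done;
  !acc

(* Hypothetical helper: processor seconds taken by [f ()], plus its result. *)
let time f =
  let t0 = Sys.time () in
  let r = f () in
  (Sys.time () -. t0, r)

let () =
  let n = 10_000_000 and m = 10_000_000 in
  let t_full, pi_approx = time (fun () -> find_pi m n) in
  let t_rng, _ = time (fun () -> rng_only m n) in
  Printf.printf "pi .= %f\n" pi_approx;
  Printf.printf "full loop: %.3fs, RNG only: %.3fs\n" t_full t_rng
```

If the two timings come out close to each other, most of the cost is in `Random.int` rather than in the loop itself.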

Edit

(I rewrote the stub to make it faster.)