Ope*_*uce 15 optimization clojure numerical-computing type-hinting
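I implemented some basic complex-number operations in Clojure and noticed that they are about 10 times slower than roughly equivalent Java code, even with type hints.
Compare: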
(defn plus [[^double x1 ^double y1] [^double x2 ^double y2]]
  [(+ x1 x2) (+ y1 y2)])
(defn times [[^double x1 ^double y1] [^double x2 ^double y2]]
  [(- (* x1 x2) (* y1 y2)) (+ (* x1 y2) (* y1 x2))])
(time (dorun (repeatedly 100000 #(plus [1 0] [0 1]))))
(time (dorun (repeatedly 100000 #(times [1 0] [0 1]))))
Output:
"Elapsed time: 69.429796 msecs"
"Elapsed time: 72.232479 msecs"
with:
public static void main( String[] args ) {
    double[] z1 = new double[] { 1, 0 };
    double[] z2 = new double[] { 0, 1 };
    double[] z3 = null;
    long l_StartTimeMillis = System.currentTimeMillis();
    for ( int i = 0; i < 100000; i++ ) {
        z3 = plus( z1, z2 ); // assign result to dummy var to stop compiler from optimising the loop away
    }
    long l_EndTimeMillis = System.currentTimeMillis();
    long l_TimeTakenMillis = l_EndTimeMillis - l_StartTimeMillis;
    System.out.format( "Time taken: %d millis\n", l_TimeTakenMillis );
    l_StartTimeMillis = System.currentTimeMillis();
    for ( int i = 0; i < 100000; i++ ) {
        z3 = times( z1, z2 );
    }
    l_EndTimeMillis = System.currentTimeMillis();
    l_TimeTakenMillis = l_EndTimeMillis - l_StartTimeMillis;
    System.out.format( "Time taken: %d millis\n", l_TimeTakenMillis );
    doNothing( z3 );
}
private static void doNothing( double[] z ) {
}
public static double[] plus (double[] z1, double[] z2) {
    return new double[] { z1[0] + z2[0], z1[1] + z2[1] };
}
public static double[] times (double[] z1, double[] z2) {
    return new double[] { z1[0]*z2[0] - z1[1]*z2[1], z1[0]*z2[1] + z1[1]*z2[0] };
}
Output:
Time taken: 6 millis
Time taken: 6 millis
In fact, the type hints seem to make no difference: if I remove them, I get roughly the same results. What is really strange is that if I run the Clojure script without a REPL, the results are even slower:
"Elapsed time: 137.337782 msecs"
"Elapsed time: 214.213993 msecs"
So my questions are: how can I get close to the performance of the Java code? And why on earth do the expressions take longer to evaluate when Clojure is run without a REPL?
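UPDATE ==============
Great: using deftype with type hints in the deftype and in the defns, and using dotimes rather than repeatedly, gives performance as good as or better than the Java version. Thanks to both of you.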
(deftype complex [^double real ^double imag])
(defn plus [^complex z1 ^complex z2]
  (let [x1 (double (.real z1))
        y1 (double (.imag z1))
        x2 (double (.real z2))
        y2 (double (.imag z2))]
    (complex. (+ x1 x2) (+ y1 y2))))
(defn times [^complex z1 ^complex z2]
  (let [x1 (double (.real z1))
        y1 (double (.imag z1))
        x2 (double (.real z2))
        y2 (double (.imag z2))]
    (complex. (- (* x1 x2) (* y1 y2)) (+ (* x1 y2) (* y1 x2)))))
(println "Warm up")
(time (dorun (repeatedly 100000 #(plus (complex. 1 0) (complex. 0 1)))))
(time (dorun (repeatedly 100000 #(times (complex. 1 0) (complex. 0 1)))))
(time (dorun (repeatedly 100000 #(plus (complex. 1 0) (complex. 0 1)))))
(time (dorun (repeatedly 100000 #(times (complex. 1 0) (complex. 0 1)))))
(time (dorun (repeatedly 100000 #(plus (complex. 1 0) (complex. 0 1)))))
(time (dorun (repeatedly 100000 #(times (complex. 1 0) (complex. 0 1)))))
(println "Try with dorun")
(time (dorun (repeatedly 100000 #(plus (complex. 1 0) (complex. 0 1)))))
(time (dorun (repeatedly 100000 #(times (complex. 1 0) (complex. 0 1)))))
(println "Try with dotimes")
(time (dotimes [_ 100000]
  (plus (complex. 1 0) (complex. 0 1))))
(time (dotimes [_ 100000]
  (times (complex. 1 0) (complex. 0 1))))
Output:
Warm up
"Elapsed time: 92.805664 msecs"
"Elapsed time: 164.929421 msecs"
"Elapsed time: 23.799012 msecs"
"Elapsed time: 32.841624 msecs"
"Elapsed time: 20.886101 msecs"
"Elapsed time: 18.872783 msecs"
Try with dorun
"Elapsed time: 19.238403 msecs"
"Elapsed time: 17.856938 msecs"
Try with dotimes
"Elapsed time: 5.165658 msecs"
"Elapsed time: 5.209027 msecs"
mik*_*era 22
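The likely reasons for your slow performance are:
The type hints (^double) are not helping you: while you can have primitive type hints on normal Clojure functions, they don't work on the elements of an ordinary vector, which are always stored boxed. For more details, see this blog post on accelerating primitive arithmetic.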
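To make the boxing point concrete, here is a minimal sketch. The names plus-vec and plus-prim are hypothetical and do not come from the question. It contrasts a vector-based function with one whose plain parameters carry primitive hints: Clojure compiles the latter to implement an additional primitive interface (clojure.lang.IFn$DDD for two double arguments and a double result), while the vector-based function only ever sees boxed java.lang.Double elements.

```clojure
;; Vector-based version: the numbers live inside persistent vectors,
;; so every element is a boxed java.lang.Double.
(defn plus-vec [[x1 y1] [x2 y2]]
  [(+ x1 x2) (+ y1 y2)])

;; Primitive-hinted version: hints on plain parameters (not on
;; destructured vector elements) let the compiler pass raw doubles.
(defn plus-prim ^double [^double a ^double b]
  (+ a b))

;; The result elements of the vector version are boxed Doubles.
(println (class (first (plus-vec [1.0 0.0] [0.0 1.0]))))

;; Only the primitive-hinted function implements the primitive interface.
(println (instance? clojure.lang.IFn$DDD plus-prim))
(println (instance? clojure.lang.IFn$DDD plus-vec))
```

This only shows where boxing occurs; it is not a benchmark, and actual timings will depend on the JVM and on warm-up.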
If you really want fast complex numbers in Clojure, you will probably need to implement them using deftype, for example:
(deftype Complex [^double real ^double imag])
Then define all your complex functions using this type. This lets you use primitive arithmetic throughout, and should give performance roughly equivalent to well-written Java code.
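As a rough sketch of what defining the complex functions on this type might look like (the function names plus and times follow the question; turning on *warn-on-reflection* is an added assumption, used here only to check that the hints are picked up):

```clojure
;; Ask the compiler to report any reflective (unhinted) member access.
(set! *warn-on-reflection* true)

(deftype Complex [^double real ^double imag])

;; With ^Complex hints, (.real a) reads the primitive double field
;; directly, so the arithmetic below operates on unboxed doubles.
(defn plus [^Complex a ^Complex b]
  (Complex. (+ (.real a) (.real b))
            (+ (.imag a) (.imag b))))

(defn times [^Complex a ^Complex b]
  (Complex. (- (* (.real a) (.real b)) (* (.imag a) (.imag b)))
            (+ (* (.real a) (.imag b)) (* (.imag a) (.real b)))))

;; (1 + 0i) * (0 + 1i) = 0 + 1i
(let [^Complex z (times (Complex. 1.0 0.0) (Complex. 0.0 1.0))]
  (println (.real z) (.imag z)))
```

If the compiler prints no reflection warnings for plus and times, the field reads are resolved at compile time. Timing such functions with dotimes after a few warm-up runs, as in the question's update, gives a fairer comparison with the Java loop.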