一些运行/ JIT错误后,Java循环变慢了?

Sch*_*tod 11 java performance benchmarking jit

所以我想对一些基本的java功能进行基准测试,以便为这个问题添加一些信息:将方法声明为静态会带来什么好处.

我知道写作基准有时并不容易,但这里发生的事情我无法解释.

请注意,我并没有考虑如何解决这个问题,而是为什么会发生这种情况*

测试类:

public class TestPerformanceOfStaticVsDynamicCalls {

    private static final long RUNS = 1_000_000_000L;

    public static void main( String [] args ){

        new TestPerformanceOfStaticVsDynamicCalls().run();
    }

    private void run(){

        long r=0;
        long start, end;

        for( int loop = 0; loop<10; loop++ ){

            // Benchmark

            start = System.currentTimeMillis();
            for( long i = 0; i < RUNS; i++ ) {
                r += addStatic( 1, i );
            }
            end = System.currentTimeMillis();
            System.out.println( "Static: " + ( end - start ) + " ms" );

            start = System.currentTimeMillis();
            for( long i = 0; i < RUNS; i++ ) {
                r += addDynamic( 1, i );
            }
            end = System.currentTimeMillis();
            System.out.println( "Dynamic: " + ( end - start ) + " ms" );

            // Do something with r to keep compiler happy
            System.out.println( r );

        }

    }


    private long addDynamic( long a, long b ){

        return a+b;
    }

    private static long addStatic( long a, long b ){

        return a+b;
    }

}
Run Code Online (Sandbox Code Playgroud)

我期待第一个循环是热身,以下循环更快.

在Eclipse中运行它会产生以下奇怪的结果:

Static: 621 ms
Dynamic: 631 ms
1000000001000000000
Static: 2257 ms
Dynamic: 2501 ms
2000000002000000000
Static: 2258 ms
Dynamic: 2469 ms
3000000003000000000
Static: 2231 ms
Dynamic: 2464 ms
4000000004000000000
Run Code Online (Sandbox Code Playgroud)

那么wtf?它变慢了.要进行交叉检查,我使用java/c 7运行相同的代码:

Static: 620 ms
Dynamic: 627 ms
1000000001000000000
Static: 897 ms
Dynamic: 617 ms
2000000002000000000
Static: 901 ms
Dynamic: 615 ms
3000000003000000000
Static: 888 ms
Dynamic: 616 ms
4000000004000000000
Run Code Online (Sandbox Code Playgroud)

所以这里只有静态调用对于后续循环变慢.更奇怪的是,如果我重新排列代码只r在最后一个循环后打印我在Eclipse中得到它:

Static: 620 ms
Dynamic: 635 ms
Static: 2285 ms
Dynamic: 893 ms
Static: 2258 ms
Dynamic: 900 ms
Static: 2280 ms
Dynamic: 905 ms
4000000004000000000
Run Code Online (Sandbox Code Playgroud)

这与java/c 7:

Static: 620 ms
Dynamic: 623 ms
Static: 890 ms
Dynamic: 614 ms
Static: 890 ms
Dynamic: 616 ms
Static: 886 ms
Dynamic: 614 ms
4000000004000000000
Run Code Online (Sandbox Code Playgroud)

而在它改变eclipse中动态/静态基准测试的顺序时:

Dynamic: 618 ms
Static: 626 ms
1000000001000000000
Dynamic: 632 ms
Static: 2524 ms
2000000002000000000
Dynamic: 617 ms
Static: 2528 ms
3000000003000000000
Dynamic: 622 ms
Static: 2506 ms
4000000004000000000
Run Code Online (Sandbox Code Playgroud)

在java/c 7中:

Dynamic: 625 ms
Static: 646 ms
1000000001000000000
Dynamic: 2470 ms
Static: 633 ms
2000000002000000000
Dynamic: 2459 ms
Static: 635 ms
3000000003000000000
Dynamic: 2464 ms
Static: 645 ms
4000000004000000000
Run Code Online (Sandbox Code Playgroud)

那么这里发生了什么?

编辑:一些系统信息:

Java version "1.7.0_55"
OpenJDK Runtime Environment (IcedTea 2.4.7) (7u55-2.4.7-1ubuntu1)
OpenJDK 64-Bit Server VM (build 24.51-b03, mixed mode)

Intel(R) Core(TM) i7-2720QM CPU @ 2.20GHz
Run Code Online (Sandbox Code Playgroud)

EDIT2:

使用Java8:

Static: 620 ms
Dynamic: 624 ms
1000000001000000000
Static: 890 ms
Dynamic: 618 ms
2000000002000000000
Static: 891 ms
Dynamic: 616 ms
3000000003000000000
Static: 892 ms
Dynamic: 617 ms
4000000004000000000
Run Code Online (Sandbox Code Playgroud)

其他代码排序在这里产生类似的奇怪(但是更好)的结果.

apa*_*gin 10

序言:手动编写微基准测试几乎总是注定要失败.
有些框架已经解决了常见的基准测试问题.

  1. JIT编译单元是一种方法.将几个基准合并到一个方法中会导致不可预测的结果.

  2. JIT严重依赖于执行配置文件,即运行时统计信息.如果方法长时间运行第一个场景,JIT将优化生成的代码.当方法突然切换到另一种情况时,不要指望它以相同的速度运行.

  3. JIT可能会跳过优化未执行的代码.它将为此代码留下一个不常见的陷阱.如果陷阱被击中,JVM将取消优化编译的方法,切换到解释器,然后用新知识重新编译代码.例如,当你的方法run第一次在第一个热循环中编译时,JIT还不知道System.out.println.一旦执行到达println,早期编译的代码很可能会被去优化.

  4. 方法越大 - 为JIT编译器优化它就越困难.例如,似乎没有足够的备用寄存器来保存所有局部变量.这就是你的情况.

总结一下,您的基准测试似乎通过以下场景:

  1. 第一个热循环(addStatic)触发run方法的编译.除addStatic方法外,执行配置文件不知道任何内容.
  2. System.out.println触发去优化,之后第二个热循环(addDynamic)导致重新编译run方法.
  3. 现在执行配置文件仅包含有关的信息addDynamic,因此JIT优化了第二个循环,并且第一个循环似乎有额外的寄存器溢出:

优化循环:

0x0000000002d01054: add    %rbx,%r14
0x0000000002d01057: add    $0x1,%rbx          ;*ladd
                                              ; - TestPerformanceOfStaticVsDynamicCalls::addDynamic@2
                                              ; - TestPerformanceOfStaticVsDynamicCalls::run@105

0x0000000002d0105b: add    $0x1,%r14          ; OopMap{rbp=Oop off=127}
                                              ;*goto
                                              ; - TestPerformanceOfStaticVsDynamicCalls::run@116

0x0000000002d0105f: test   %eax,-0x1c91065(%rip)        # 0x0000000001070000
                                              ;*lload
                                              ; - TestPerformanceOfStaticVsDynamicCalls::run@92
                                              ;   {poll}
0x0000000002d01065: cmp    $0x3b9aca00,%rbx
0x0000000002d0106c: jl     0x0000000002d01054
Run Code Online (Sandbox Code Playgroud)

循环使用额外的寄存器溢出:

0x0000000002d011d0: mov    0x28(%rsp),%r11  <---- the problem is here
0x0000000002d011d5: add    %r10,%r11
0x0000000002d011d8: add    $0x1,%r10
0x0000000002d011dc: add    $0x1,%r11
0x0000000002d011e0: mov    %r11,0x28(%rsp)    ;*ladd
                                              ; - TestPerformanceOfStaticVsDynamicCalls::addStatic@2
                                              ; - TestPerformanceOfStaticVsDynamicCalls::run@33

0x0000000002d011e5: mov    0x28(%rsp),%r11  <---- the problem is here
0x0000000002d011ea: add    $0x1,%r11          ; OopMap{[32]=Oop off=526}
                                              ;*goto
                                              ; - TestPerformanceOfStaticVsDynamicCalls::run@44

0x0000000002d011ee: test   %eax,-0x1c911f4(%rip)        # 0x0000000001070000
                                              ;*goto
                                              ; - TestPerformanceOfStaticVsDynamicCalls::run@44
                                              ;   {poll}
0x0000000002d011f4: cmp    $0x3b9aca00,%r10
0x0000000002d011fb: jl     0x0000000002d011d0  ;*ifge
                                              ; - TestPerformanceOfStaticVsDynamicCalls::run@25
Run Code Online (Sandbox Code Playgroud)

PS以下JVM选项对于分析JIT编译很有用:

-XX:+PrintCompilation -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining -XX:+PrintAssembly -XX:CompileOnly=TestPerformanceOfStaticVsDynamicCalls
Run Code Online (Sandbox Code Playgroud)