Java for loop performance question

kuk*_*das 21 java compiler-construction performance for-loop microbenchmark

Consider this example:

public static void main(final String[] args) {
    final List<String> myList = Arrays.asList("A", "B", "C", "D");
    final long start = System.currentTimeMillis();
    for (int i = 1000000; i > myList.size(); i--) {
        System.out.println("Hello");
    }
    final long stop = System.currentTimeMillis();
    System.out.println("Finish: " + (stop - start));
}

VS

public static void main(final String[] args) {
    final List<String> myList = Arrays.asList("A", "B", "C", "D");
    final long start = System.currentTimeMillis();
    final int size = myList.size();
    for (int i = 1000000; i > size; i--) {
        System.out.println("Hello");
    }
    final long stop = System.currentTimeMillis();
    System.out.println("Finish: " + (stop - start));
}

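Does this make any difference? On my machine the second one seems to run faster, but I don't know whether that is really accurate. Will the compiler optimize this code? I could imagine it would if the loop condition involved an immutable object (for example, a String array).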

Rex*_*err 25

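If you want to test something like this, you have to tune your microbenchmark so that it measures what you actually care about.

First, make the loop body cheap but impossible to skip. Computing a sum usually does the trick.

Second, compare the two timings.

Here is some code that does both: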

import java.util.*;

public class Test {

public static long run1() {
  final List<String> myList = Arrays.asList("A", "B", "C", "D");
  final long start = System.nanoTime();
  int sum = 0;
  for (int i = 1000000000; i > myList.size(); i--) sum += i;
  final long stop = System.nanoTime();
  // The loop runs about 1e9 times, so total nanoseconds * 1e-9 is nanoseconds per iteration.
  System.out.println("Finish: " + (stop - start)*1e-9 + " ns/op; sum = " + sum);
  return stop-start;
}

public static long run2() {
  final List<String> myList = Arrays.asList("A", "B", "C", "D");
  final long start = System.nanoTime();
  int sum = 0;
  int limit = myList.size();
  for (int i = 1000000000; i > limit; i--) sum += i;
  final long stop = System.nanoTime();
  // Same scaling as in run1: about 1e9 iterations, so this prints nanoseconds per iteration.
  System.out.println("Finish: " + (stop - start)*1e-9 + " ns/op; sum = " + sum);
  return stop-start;
}

public static void main(String[] args) {
  for (int i=0 ; i<5 ; i++) {
    long t1 = run1();
    long t2 = run2();
    System.out.println("  Speedup = " + (t1-t2)*1e-9 + " ns/op\n");
  }
}

}

If we run it, on my system we get:

Finish: 0.481741256 ns/op; sum = -243309322
Finish: 0.40228402 ns/op; sum = -243309322
  Speedup = 0.079457236 ns/op

Finish: 0.450627151 ns/op; sum = -243309322
Finish: 0.43534661700000005 ns/op; sum = -243309322
  Speedup = 0.015280534 ns/op

Finish: 0.47738474700000005 ns/op; sum = -243309322
Finish: 0.403698331 ns/op; sum = -243309322
  Speedup = 0.073686416 ns/op

Finish: 0.47729349600000004 ns/op; sum = -243309322
Finish: 0.405540508 ns/op; sum = -243309322
  Speedup = 0.071752988 ns/op

Finish: 0.478979617 ns/op; sum = -243309322
Finish: 0.36067492700000003 ns/op; sum = -243309322
  Speedup = 0.11830469 ns/op

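This means the overhead of the method call is about 0.1 ns per iteration. If your loop body takes no more than 1-2 ns, then you should care about this. Otherwise, don't.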
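A side note on the negative sum in the output: it is just int overflow, and it does not affect the timing. As a minimal sketch (the class and method names here are illustrative, not part of the benchmark), the value can be reproduced by computing the exact sum of 5..1000000000 in a long and keeping only the low 32 bits, which is what repeated int addition amounts to:

```java
class SumCheck {
    // Exact sum of the integers from..to (inclusive), computed in long,
    // then truncated to int. Truncation keeps the low 32 bits, which matches
    // what wrapping int addition in the benchmark loop produces.
    static int wrappedSum(long from, long to) {
        long exact = (from + to) * (to - from + 1) / 2;
        return (int) exact;
    }

    public static void main(String[] args) {
        // The benchmark loop adds i for i = 1000000000 down to 5 (list size is 4).
        System.out.println(wrappedSum(5L, 1_000_000_000L)); // prints -243309322
    }
}
```

Printing the sum still serves its purpose: it forces the loop result to be used, so the loop cannot simply be skipped.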


duf*_*ymo 10

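Personally, I don't think you can draw any meaningful conclusions from a contrived example like this.

But if you really want to know, why not use javap to disassemble the compiled class and see what is different? Why guess at what the compiler is doing when you can see it for yourself?

Bytecode for the first case: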

public class Stackoverflow extends java.lang.Object{
public Stackoverflow();
  Code:
   0:   aload_0
   1:   invokespecial   #1; //Method java/lang/Object."<init>":()V
   4:   return

public static void main(java.lang.String[]);
  Code:
   0:   iconst_4
   1:   anewarray       #2; //class java/lang/String
   4:   dup
   5:   iconst_0
   6:   ldc     #3; //String A
   8:   aastore
   9:   dup
   10:  iconst_1
   11:  ldc     #4; //String B
   13:  aastore
   14:  dup
   15:  iconst_2
   16:  ldc     #5; //String C
   18:  aastore
   19:  dup
   20:  iconst_3
   21:  ldc     #6; //String D
   23:  aastore
   24:  invokestatic    #7; //Method java/util/Arrays.asList:([Ljava/lang/Object;)Ljava/util/List
   27:  astore_1
   28:  invokestatic    #8; //Method java/lang/System.currentTimeMillis:()J
   31:  lstore_2
   32:  ldc     #9; //int 1000000
   34:  istore  4
   36:  iload   4
   38:  aload_1
   39:  invokeinterface #10,  1; //InterfaceMethod java/util/List.size:()I
   44:  if_icmple       61
   47:  getstatic       #11; //Field java/lang/System.out:Ljava/io/PrintStream;
   50:  ldc     #12; //String Hello
   52:  invokevirtual   #13; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
   55:  iinc    4, -1
   58:  goto    36
   61:  invokestatic    #8; //Method java/lang/System.currentTimeMillis:()J
   64:  lstore  4
   66:  getstatic       #11; //Field java/lang/System.out:Ljava/io/PrintStream;
   69:  new     #14; //class java/lang/StringBuilder
   72:  dup
   73:  invokespecial   #15; //Method java/lang/StringBuilder."<init>":()V
   76:  ldc     #16; //String Finish:
   78:  invokevirtual   #17; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/la
   81:  lload   4
   83:  lload_2
   84:  lsub
   85:  invokevirtual   #18; //Method java/lang/StringBuilder.append:(J)Ljava/lang/StringBuilder;
   88:  invokevirtual   #19; //Method java/lang/StringBuilder.toString:()Ljava/lang/String;
   91:  invokevirtual   #13; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
   94:  return
}

Bytecode for the second case:

public class Stackoverflow extends java.lang.Object{
public Stackoverflow();
  Code:
   0:   aload_0
   1:   invokespecial   #1; //Method java/lang/Object."<init>":()V
   4:   return

public static void main(java.lang.String[]);
  Code:
   0:   iconst_4
   1:   anewarray       #2; //class java/lang/String
   4:   dup
   5:   iconst_0
   6:   ldc     #3; //String A
   8:   aastore
   9:   dup
   10:  iconst_1
   11:  ldc     #4; //String B
   13:  aastore
   14:  dup
   15:  iconst_2
   16:  ldc     #5; //String C
   18:  aastore
   19:  dup
   20:  iconst_3
   21:  ldc     #6; //String D
   23:  aastore
   24:  invokestatic    #7; //Method java/util/Arrays.asList:([Ljava/lang/Object;)Ljava/util/List;
   27:  astore_1
   28:  invokestatic    #8; //Method java/lang/System.currentTimeMillis:()J
   31:  lstore_2
   32:  aload_1
   33:  invokeinterface #9,  1; //InterfaceMethod java/util/List.size:()I
   38:  istore  4
   40:  ldc     #10; //int 1000000
   42:  istore  5
   44:  iload   5
   46:  iload   4
   48:  if_icmple       65
   51:  getstatic       #11; //Field java/lang/System.out:Ljava/io/PrintStream;
   54:  ldc     #12; //String Hello
   56:  invokevirtual   #13; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
   59:  iinc    5, -1
   62:  goto    44
   65:  invokestatic    #8; //Method java/lang/System.currentTimeMillis:()J
   68:  lstore  5
   70:  getstatic       #11; //Field java/lang/System.out:Ljava/io/PrintStream;
   73:  new     #14; //class java/lang/StringBuilder
   76:  dup
   77:  invokespecial   #15; //Method java/lang/StringBuilder."<init>":()V
   80:  ldc     #16; //String Finish:
   82:  invokevirtual   #17; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
   85:  lload   5
   87:  lload_2
   88:  lsub
   89:  invokevirtual   #18; //Method java/lang/StringBuilder.append:(J)Ljava/lang/StringBuilder;
   92:  invokevirtual   #19; //Method java/lang/StringBuilder.toString:()Ljava/lang/String;
   95:  invokevirtual   #13; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
   98:  return
}

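There are differences (the first version calls List.size() inside the loop, the second calls it once before the loop), but I'm not sure I can make a definitive statement about their effect on performance.

I would write the second version, because it means (on the face of it) one method call instead of one per loop iteration. I don't know whether the compiler can optimize the call away, but I'm sure I can do it myself quite easily. So I do, regardless of its effect on wall-clock time.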


Tof*_*eer 9

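I once worked on a project where my first task was to track down some insanely slow code (it was on a brand-new 486 machine and took about 20 minutes to execute):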

for(size_t i = 0; i < strlen(data); i++)
{
    // do something with data[i]
}

The solution (which got it down to two minutes or less) was:

size_t length = strlen(data);

for(size_t i = 0; i < length; i++)
{
    // do something with data[i]
}

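The problem was that "data" was over 1 million characters long, and strlen has to count every one of them on every call.

In the case of Java, the "size()" method probably just returns a variable, so the VM will likely inline it. On a VM like the one on Android, it might not. So the answer is "it depends".

My personal preference is never to call a method more than once if it is supposed to return the same result each time. That way, if the method does involve computation, it is performed only once and never becomes an issue.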
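To make the difference concrete, here is a small Java sketch that counts how often the length method is invoked in each style. The names (HoistDemo, countedLength) are hypothetical, and countedLength is only a stand-in for a method that has to do real work on every call, the way strlen does:

```java
class HoistDemo {
    static int calls; // how many times countedLength has been invoked

    // Stand-in for a costly length computation: scans the array like strlen.
    static int countedLength(char[] data) {
        calls++;
        int n = 0;
        while (n < data.length && data[n] != '\0') n++;
        return n;
    }

    // Length method called in the loop condition: evaluated before every iteration.
    static int callsWhenInCondition(char[] data) {
        calls = 0;
        for (int i = 0; i < countedLength(data); i++) { /* do something with data[i] */ }
        return calls;
    }

    // Length computed once and stored in a local before the loop.
    static int callsWhenHoisted(char[] data) {
        calls = 0;
        final int length = countedLength(data);
        for (int i = 0; i < length; i++) { /* do something with data[i] */ }
        return calls;
    }

    public static void main(String[] args) {
        char[] data = "hello".toCharArray();
        System.out.println(callsWhenInCondition(data)); // prints 6 (5 iterations plus the final failing check)
        System.out.println(callsWhenHoisted(data));     // prints 1
    }
}
```

With a scanning length method the first form does work proportional to the square of the input length, which is consistent with the slowdown described above.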


Tho*_*nin 5

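Note that the javac compiler has little to do with optimization. The "important" compiler is the JIT compiler, which lives inside the JVM.

In your example, in the most general case, myList.size() is a simple method dispatch that returns the contents of a field in the List instance. This is negligible work compared with what System.out.println("Hello") implies (at least one system call, hence hundreds of clock cycles, versus no more than a dozen for the method dispatch). I very much doubt that your two versions could show a significant difference in speed.

More generally, the JIT compiler should recognize this call to size() as a call on a known instance, so that it can perform the method dispatch with a direct function call (which is faster), or even inline the size() call, reducing it to a simple instance field access.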
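As a rough illustration of what "reducing the call to a field access" means, the sketch below writes the same loop twice: once calling size(), and once with the call replaced by hand with the field read it returns. TinyList is a hypothetical stand-in, not the class that Arrays.asList actually returns, and whether a given JVM performs this transformation is an assumption that depends on the JIT:

```java
class InlineSketch {
    // Hypothetical minimal list-like class whose size() just returns a field.
    static final class TinyList {
        private final int size;
        TinyList(int size) { this.size = size; }
        int size() { return size; }
    }

    // Loop as written in the question: a method call in the condition.
    static long viaMethod(TinyList list, int from) {
        long sum = 0;
        for (int i = from; i > list.size(); i--) sum += i;
        return sum;
    }

    // Hand-inlined equivalent: the call is replaced by the field read.
    // This only mimics what an inlining JIT might produce; it is not its output.
    static long handInlined(TinyList list, int from) {
        long sum = 0;
        for (int i = from; i > list.size; i--) sum += i;
        return sum;
    }

    public static void main(String[] args) {
        TinyList list = new TinyList(4);
        System.out.println(viaMethod(list, 10));   // prints 45 (10 + 9 + ... + 5)
        System.out.println(handInlined(list, 10)); // prints 45
    }
}
```

Both methods compute the same result; the only difference is whether the loop condition goes through a method dispatch.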