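Java has two ways of checking whether two booleans differ. You can compare them with != or with ^ (xor). Of course, these two operators produce the same result in all cases. Still, it makes sense for both to be included, as discussed, for example, in "What's the difference between XOR and NOT-EQUAL-TO?". It is good for developers to be able to pick one over the other depending on context: sometimes "is exactly one of these booleans true" reads better, and other times "are these two booleans different" communicates intent better. So perhaps which one to use should be a matter of taste and style.
What surprised me is that javac does not treat them identically! Consider this class: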
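Since a boolean has only two values, the claim that both operators always agree can be checked exhaustively. The following is a minimal sketch; the class and method names are made up for illustration.

```java
class BooleanXorVsNotEqual {
    // Returns true if p ^ q and p != q agree for all four input combinations.
    static boolean operatorsAgree() {
        boolean[] values = {false, true};
        for (boolean p : values) {
            for (boolean q : values) {
                if ((p ^ q) != (p != q)) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("operators agree: " + operatorsAgree());
    }
}
```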
class Test {
    public boolean xor(boolean p, boolean q) {
        return p ^ q;
    }

    public boolean inequal(boolean p, boolean q) {
        return p != q;
    }
}
Obviously, the two methods have the same visible behavior. But they have different bytecode:
$ javap -c Test
Compiled from "Test.java"
class Test {
Test();
Code:
0: aload_0
1: invokespecial #1 // Method java/lang/Object."<init>":()V
4: return
public boolean xor(boolean, boolean);
Code:
0: iload_1
1: iload_2
2: ixor
3: ireturn
public boolean inequal(boolean, boolean);
Code:
0: iload_1
1: iload_2
2: if_icmpeq 9
5: iconst_1
6: goto 10
9: iconst_0
10: ireturn
}
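One way to read the two listings is to write the second one back as Java. The sketch below is a hand transliteration, not javac output, and the class and method names are hypothetical: the `ixor` version is a single operation, while the `if_icmpeq` version behaves like a branch that pushes either 1 or 0.

```java
class BytecodeShapes {
    // Mirrors the xor bytecode: load both operands, ixor, return.
    static boolean xorShape(boolean p, boolean q) {
        return p ^ q;
    }

    // Mirrors the inequal bytecode: if_icmpeq jumps to "push 0",
    // otherwise execution falls through to "push 1".
    static boolean inequalShape(boolean p, boolean q) {
        if (p == q) {       // 2: if_icmpeq 9
            return false;   // 9: iconst_0
        }
        return true;        // 5: iconst_1
    }

    public static void main(String[] args) {
        boolean[] values = {false, true};
        for (boolean p : values) {
            for (boolean q : values) {
                System.out.println(p + ", " + q + " -> "
                        + xorShape(p, q) + " / " + inequalShape(p, q));
            }
        }
    }
}
```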
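If I had to guess, I would say that xor performs better, since it simply returns the result of the operation; adding a jump and an extra load looks like wasted work. But instead of guessing, I benchmarked several billion calls to both methods using Clojure's Criterium benchmarking tool. The results are close enough that, although xor looks a bit faster, I do not know enough statistics to say whether the difference is significant: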
user=> (let [t (Test.)] (bench (.xor t true false)))
Evaluation count : 4681301040 in 60 samples of 78021684 calls.
Execution time mean : 4.273428 ns
Execution time std-deviation : 0.168423 ns
Execution time lower quantile : 4.044192 ns ( 2.5%)
Execution time upper quantile : 4.649796 ns (97.5%)
Overhead used : 8.723577 ns
Found 2 outliers in 60 samples (3.3333 %)
low-severe 2 (3.3333 %)
Variance from outliers : 25.4745 % Variance is moderately inflated by outliers
user=> (let [t (Test.)] (bench (.inequal t true false)))
Evaluation count : 4570766220 in 60 samples of 76179437 calls.
Execution time mean : 4.492847 ns
Execution time std-deviation : 0.162946 ns
Execution time lower quantile : 4.282077 ns ( 2.5%)
Execution time upper quantile : 4.813433 ns (97.5%)
Overhead used : 8.723577 ns
Found 2 outliers in 60 samples (3.3333 %)
low-severe 2 (3.3333 %)
Variance from outliers : 22.2554 % Variance is moderately inflated by outliers
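For a rough feel of whether the two means differ by more than noise, one option is Welch's t statistic computed from the reported means and standard deviations. This is only a sketch under assumptions the output does not confirm: that the 60 samples per run are independent and roughly normal, and that the reported standard deviation describes those samples. The class and method names are hypothetical.

```java
class BenchmarkDifference {
    // Welch's t statistic for two samples described by mean, standard deviation and size.
    static double welchT(double mean1, double sd1, int n1,
                         double mean2, double sd2, int n2) {
        double standardError = Math.sqrt(sd1 * sd1 / n1 + sd2 * sd2 / n2);
        return (mean2 - mean1) / standardError;
    }

    public static void main(String[] args) {
        // Numbers copied from the two benchmark outputs above (xor first, inequal second).
        double t = welchT(4.273428, 0.168423, 60, 4.492847, 0.162946, 60);
        System.out.printf("Welch t statistic: %.2f%n", t);
    }
}
```

A large statistic would only say that the gap is unlikely to be noise under those assumptions; it says nothing about whether a gap of a fraction of a nanosecond matters in practice.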
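Is there some reason to prefer writing one over the other, performance-wise¹? Is there some context in which the difference in implementation makes one more suitable than the other? Or does anyone know why javac implements these two identical operations so differently?
¹ Of course, I will not recklessly use this information to micro-optimize. I am just curious how this all works.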
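Well, I will provide how the CPU translates this shortly and update the post, but in the meantime: you are looking at waaaay too small a difference to care.
Bytecode in Java is not an indication of how fast (or not) a method will execute; there are two JIT compilers that will make this method look entirely different once it is hot enough. Also, javac is known to do very few optimizations when it compiles the code; the real optimizations come from the JIT.
I have put together some tests for this using JMH, with the C1 compiler only, with C2 replaced by GraalVM, and with no JIT at all... (lots of testing code follows; you can skip it and just look at the results. This was done with jdk-12, by the way.) JMH is the de facto tool in the Java world for micro-benchmarks, which are notoriously error-prone when written by hand.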
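To watch the JIT get involved, one option is to call the method in a hot loop and run the program with the HotSpot flag `-XX:+PrintCompilation`. The following is a minimal sketch; the class name, method names, and iteration count are made up for illustration, and the small method will most likely be inlined into the loop rather than compiled on its own.

```java
class HotLoop {
    static boolean inequal(boolean p, boolean q) {
        return p != q;
    }

    // Calls inequal() repeatedly with varying inputs and counts how often it returns true.
    static int countDifferent(int iterations) {
        int different = 0;
        for (int i = 0; i < iterations; i++) {
            boolean p = (i & 1) == 0;
            boolean q = (i & 2) == 0;
            if (inequal(p, q)) {
                different++;
            }
        }
        return different;
    }

    public static void main(String[] args) {
        // Run as: java -XX:+PrintCompilation HotLoop
        System.out.println(countDifferent(10_000_000));
    }
}
```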
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Warmup;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;

// BooleanExecutionPlan is the benchmark state class that supplies the two booleans;
// it is not shown in this post.
@Warmup(iterations = 10)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Measurement(iterations = 2, time = 2, timeUnit = TimeUnit.SECONDS)
public class BooleanCompare {

    public static void main(String[] args) throws Exception {
        Options opt = new OptionsBuilder()
            .include(BooleanCompare.class.getName())
            .build();
        new Runner(opt).run();
    }

    // Default JVM settings (tiered compilation: C1 + C2)
    @Benchmark
    @BenchmarkMode(Mode.AverageTime)
    @Fork(1)
    public boolean xor(BooleanExecutionPlan plan) {
        return plan.booleans()[0] ^ plan.booleans()[1];
    }

    @Benchmark
    @BenchmarkMode(Mode.AverageTime)
    @Fork(1)
    public boolean plain(BooleanExecutionPlan plan) {
        return plan.booleans()[0] != plan.booleans()[1];
    }

    // Interpreter only, no JIT
    @Benchmark
    @BenchmarkMode(Mode.AverageTime)
    @Fork(value = 1, jvmArgsAppend = "-Xint")
    public boolean xorNoJIT(BooleanExecutionPlan plan) {
        return plan.booleans()[0] ^ plan.booleans()[1];
    }

    @Benchmark
    @BenchmarkMode(Mode.AverageTime)
    @Fork(value = 1, jvmArgsAppend = "-Xint")
    public boolean plainNoJIT(BooleanExecutionPlan plan) {
        return plan.booleans()[0] != plan.booleans()[1];
    }

    // C2 only
    @Benchmark
    @BenchmarkMode(Mode.AverageTime)
    @Fork(value = 1, jvmArgsAppend = "-XX:-TieredCompilation")
    public boolean xorC2Only(BooleanExecutionPlan plan) {
        return plan.booleans()[0] ^ plan.booleans()[1];
    }

    @Benchmark
    @BenchmarkMode(Mode.AverageTime)
    @Fork(value = 1, jvmArgsAppend = "-XX:-TieredCompilation")
    public boolean plainC2Only(BooleanExecutionPlan plan) {
        return plan.booleans()[0] != plan.booleans()[1];
    }

    // C1 only
    @Benchmark
    @BenchmarkMode(Mode.AverageTime)
    @Fork(value = 1, jvmArgsAppend = "-XX:TieredStopAtLevel=1")
    public boolean xorC1Only(BooleanExecutionPlan plan) {
        return plan.booleans()[0] ^ plan.booleans()[1];
    }

    @Benchmark
    @BenchmarkMode(Mode.AverageTime)
    @Fork(value = 1, jvmArgsAppend = "-XX:TieredStopAtLevel=1")
    public boolean plainC1Only(BooleanExecutionPlan plan) {
        return plan.booleans()[0] != plan.booleans()[1];
    }

    // GraalVM compiler in place of C2
    @Benchmark
    @BenchmarkMode(Mode.AverageTime)
    @Fork(value = 1,
        jvmArgsAppend = {
            "-XX:+UnlockExperimentalVMOptions",
            "-XX:+EagerJVMCI",
            "-Dgraal.ShowConfiguration=info",
            "-XX:+UseJVMCICompiler",
            "-XX:+EnableJVMCI"
        })
    public boolean xorGraalVM(BooleanExecutionPlan plan) {
        return plan.booleans()[0] ^ plan.booleans()[1];
    }

    @Benchmark
    @BenchmarkMode(Mode.AverageTime)
    @Fork(value = 1,
        jvmArgsAppend = {
            "-XX:+UnlockExperimentalVMOptions",
            "-XX:+EagerJVMCI",
            "-Dgraal.ShowConfiguration=info",
            "-XX:+UseJVMCICompiler",
            "-XX:+EnableJVMCI"
        })
    public boolean plainGraalVM(BooleanExecutionPlan plan) {
        return plan.booleans()[0] != plan.booleans()[1];
    }
}
Results:
BooleanCompare.plain avgt 2 3.125 ns/op
BooleanCompare.xor avgt 2 2.976 ns/op
BooleanCompare.plainC1Only avgt 2 3.400 ns/op
BooleanCompare.xorC1Only avgt 2 3.379 ns/op
BooleanCompare.plainC2Only avgt 2 2.583 ns/op
BooleanCompare.xorC2Only avgt 2 2.685 ns/op
BooleanCompare.plainGraalVM avgt 2 2.980 ns/op
BooleanCompare.xorGraalVM avgt 2 3.868 ns/op
BooleanCompare.plainNoJIT avgt 2 243.348 ns/op
BooleanCompare.xorNoJIT avgt 2 201.342 ns/op
I am not versatile enough to read assembler fluently, though I sometimes like to do it... There are some interesting things here. If we do:
C1 compiler only, with !=
/*
* run many iterations of this with :
* java -XX:+UnlockDiagnosticVMOptions
* -XX:TieredStopAtLevel=1
* "-XX:CompileCommand=print,com/so/BooleanCompare.compare"
* com.so.BooleanCompare
*/
public static boolean compare(boolean left, boolean right) {
return left != right;
}
We get:
0x000000010d1b2bc7: push %rbp
0x000000010d1b2bc8: sub $0x30,%rsp ;*iload_0 {reexecute=0 rethrow=0 return_oop=0}
; - com.so.BooleanCompare::compare@0 (line 22)
0x000000010d1b2bcc: cmp %edx,%esi
0x000000010d1b2bce: mov $0x0,%eax
0x000000010d1b2bd3: je 0x000000010d1b2bde
0x000000010d1b2bd9: mov $0x1,%eax
0x000000010d1b2bde: and $0x1,%eax
0x000000010d1b2be1: add $0x30,%rsp
0x000000010d1b2be5: pop %rbp
To me, this code is fairly obvious: put 0 into eax, compare (edx, esi) -> if they are not equal, put 1 into eax. Return eax & 1.
C1 compiler with ^:
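The reading above can be written out as a small Java sketch. This is a hand transliteration of the listing with the registers modeled as int variables, not something the JVM produces, and the class name is hypothetical.

```java
class C1NotEqualSketch {
    // Java-level reading of the C1 listing; esi and edx hold the two booleans as 0 or 1.
    static int compare(int esi, int edx) {
        int eax = 0;        // mov $0x0,%eax
        if (esi != edx) {   // cmp %edx,%esi ; je skips the next mov when they are equal
            eax = 1;        // mov $0x1,%eax
        }
        return eax & 1;     // and $0x1,%eax
    }

    public static void main(String[] args) {
        for (int left = 0; left <= 1; left++) {
            for (int right = 0; right <= 1; right++) {
                System.out.println(left + " != " + right + " -> " + compare(left, right));
            }
        }
    }
}
```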
public static boolean compare(boolean left, boolean right) {
return left ^ right;
}
# parm0: rsi = boolean
# parm1: rdx = boolean
# [sp+0x40] (sp of caller)
0x000000011326e5c0: mov %eax,-0x14000(%rsp)
0x000000011326e5c7: push %rbp
0x000000011326e5c8: sub $0x30,%rsp ;*iload_0 {reexecute=0 rethrow=0 return_oop=0}
; - com.so.BooleanCompare::compare@0 (line 22)
0x000000011326e5cc: xor %rdx,%rsi
0x000000011326e5cf: and $0x1,%esi
0x000000011326e5d2: mov %rsi,%rax
0x000000011326e5d5: add $0x30,%rsp
0x000000011326e5d9: pop %rbp
I do not really know why and $0x1,%esi is needed here; otherwise this is fairly simple too, I think.
But if I enable the C2 compiler, things get a lot more interesting.
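One possible reading of the mask, which is my assumption and not something the listing confirms: it guarantees that the result is exactly 0 or 1 even if a register carried bits other than the lowest one. The sketch below models the registers as int values; the class name and the 0x101 input are made up for illustration.

```java
class XorMaskSketch {
    // xor %rdx,%rsi without the following mask
    static int xorUnmasked(int rsi, int rdx) {
        return rsi ^ rdx;
    }

    // xor %rdx,%rsi followed by and $0x1,%esi
    static int xorMasked(int rsi, int rdx) {
        return (rsi ^ rdx) & 1;
    }

    public static void main(String[] args) {
        // With clean 0/1 inputs the mask changes nothing.
        System.out.println(xorUnmasked(1, 0) + " " + xorMasked(1, 0));
        // Hypothetical input with an extra high bit set: only the masked result stays in {0, 1}.
        System.out.println(xorUnmasked(0x101, 1) + " " + xorMasked(0x101, 1));
    }
}
```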
/**
* run with java
* -XX:+UnlockDiagnosticVMOptions
* -XX:CICompilerCount=2
* -XX:-TieredCompilation
* "-XX:CompileCommand=print,com/so/BooleanCompare.compare"
* com.so.BooleanCompare
*/
public static boolean compare(boolean left, boolean right) {
return left != right;
}
# parm0: rsi = boolean
# parm1: rdx = boolean
# [sp+0x20] (sp of caller)
0x000000011a2bbfa0: sub $0x18,%rsp
0x000000011a2bbfa7: mov %rbp,0x10(%rsp)
0x000000011a2bbfac: xor %r10d,%r10d
0x000000011a2bbfaf: mov $0x1,%eax
0x000000011a2bbfb4: cmp %edx,%esi
0x000000011a2bbfb6: cmove %r10d,%eax
0x000000011a2bbfba: add $0x10,%rsp
0x000000011a2bbfbe: pop %rbp
I do not even see the classic prologue push ebp; mov ebp, esp; sub esp, x, but instead something very unusual (at least to me):
sub $0x18,%rsp
mov %rbp,0x10(%rsp)
....
add $0x10,%rsp
pop %rbp
Again, someone more versatile than me can hopefully explain. Otherwise it is like a better version of what C1 generated:
xor %r10d,%r10d // put zero into r10d
mov $0x1,%eax // put 1 into eax
cmp %edx,%esi // compare edx and esi
cmove %r10d,%eax // conditionally move the contents of r10d into eax
AFAIK, cmp/cmove is better than cmp/je because it avoids a conditional jump that the branch predictor could get wrong; that is at least what I have read...
XOR with the C2 compiler:
public static boolean compare(boolean left, boolean right) {
return left ^ right;
}
0x000000010e6c9a20: sub $0x18,%rsp
0x000000010e6c9a27: mov %rbp,0x10(%rsp)
0x000000010e6c9a2c: xor %edx,%esi
0x000000010e6c9a2e: mov %esi,%eax
0x000000010e6c9a30: and $0x1,%eax
0x000000010e6c9a33: add $0x10,%rsp
0x000000010e6c9a37: pop %rbp
It really does look almost the same as what the C1 compiler generated.