Cel*_*tas 3 java optimization performance fizzbuzz
哪个fizzbuzz实现效率更高?
public static void fizzBuzz1(int n)
{
boolean fizzed, buzzed;
for(int i = 1; i <= n; i++)
{
fizzed = buzzed = false;
if(i % 3 == 0)
fizzed = true;
if(i % 5 == 0)
buzzed = true;
if(fizzed && buzzed)
System.out.println("FizzBuzz");
else if(fizzed)
System.out.println("Fizz");
else if(buzzed)
System.out.println("Buzz");
else
System.out.println(i);
}
}
public static void fizzBuzz2(int n)
{
for(int i = 1; i <= n; i++)
{
if(i % 3 == 0 && i % 5 == 0)
System.out.println("FizzBuzz");
else if(i % 3 == 0)
System.out.println("Fizz");
else if(i % 5 == 0)
System.out.println("Buzz");
else
System.out.println(i);
}
}
Run Code Online (Sandbox Code Playgroud)
Mar*_*o13 11
这一次采取了一些有趣的转折.
首先,我试着看一下使用原始方法生成的程序集.但是,JIT做了相当多的内联和优化,这包括System.out.println调用,因此产生的程序集输出太大(对我来说)在合理的时间内对其进行合理分析.
所以我简化了整个过程,以便能够专注于实际的问题.最后,我运行了以下程序:
class Test04
{
public static void main(String args[])
{
long sum = 0;
for (int i=1000; i<12000; i++)
{
sum += fizzBuzz1(i);
sum += fizzBuzz2(i);
}
System.out.println(sum);
}
public static long fizzBuzz1(int n)
{
long sum = 0;
for(int i = 1; i <= n; i++)
{
sum += fizzBuzzCore1(i);
}
return sum;
}
public static long fizzBuzzCore1(int i)
{
boolean fizzed = false;
boolean buzzed = false;
if(i % 3 == 0)
fizzed = true;
if(i % 5 == 0)
buzzed = true;
if(fizzed && buzzed)
return 4;
else if(fizzed)
return 3;
else if(buzzed)
return 2;
else
return 1;
}
public static long fizzBuzz2(int n)
{
long sum = 0;
for(int i = 1; i <= n; i++)
{
sum += fizzBuzzCore2(i);
}
return sum;
}
public static long fizzBuzzCore2(int i)
{
if(i % 3 == 0 && i % 5 == 0)
return 4;
else if(i % 3 == 0)
return 3;
else if(i % 5 == 0)
return 2;
else
return 1;
}
}
Run Code Online (Sandbox Code Playgroud)
返回值旨在防止他完全优化掉调用,并提取旨在保持必须比较的组件输出的大小尽可能小的"核心"方法.
(注意:当然,这些修改可能会影响优化.例如,JIT有一个方法可能具有的字节码指令数量的限制,在它被认为太大而无法内联之前-XX:MaxInlineSize=35.但它的效果应该是两种方法大致相同,因此仍然可以得出关于实际问题的所需信息.
而且,并不是一个惊喜:在最后一次优化之后,两种方法的汇编代码都包含相同的指令 - 这里的汇编fizzBuzzCore1作为参考:
Decoding compiled method 0x00000000026c0090:
Code:
[Entry Point]
[Verified Entry Point]
[Constants]
# {method} {0x0000000057260528} 'fizzBuzzCore1' '(I)J' in 'Test04'
# parm0: rdx = int
# [sp+0x20] (sp of caller)
0x00000000026c01c0: sub $0x18,%rsp
0x00000000026c01c7: mov %rbp,0x10(%rsp) ;*synchronization entry
; - Test04::fizzBuzzCore1@-1 (line 27)
0x00000000026c01cc: movslq %edx,%r10
0x00000000026c01cf: mov %edx,%r11d
0x00000000026c01d2: sar $0x1f,%r11d ;*irem
; - Test04::fizzBuzzCore1@6 (line 29)
0x00000000026c01d6: imul $0x66666667,%r10,%r8
0x00000000026c01dd: imul $0x55555556,%r10,%r10
0x00000000026c01e4: sar $0x21,%r8
0x00000000026c01e8: sar $0x20,%r10
0x00000000026c01ec: mov %r8d,%r8d
0x00000000026c01ef: sub %r11d,%r8d ;*irem
; - Test04::fizzBuzzCore1@14 (line 31)
0x00000000026c01f2: mov %r10d,%r10d
0x00000000026c01f5: sub %r11d,%r10d ;*irem
; - Test04::fizzBuzzCore1@6 (line 29)
0x00000000026c01f8: mov %r8d,%r11d
0x00000000026c01fb: shl $0x2,%r11d
0x00000000026c01ff: add %r8d,%r11d ;*irem
; - Test04::fizzBuzzCore1@14 (line 31)
0x00000000026c0202: mov %r10d,%r9d
0x00000000026c0205: shl %r9d
0x00000000026c0208: add %r10d,%r9d ;*irem
; - Test04::fizzBuzzCore1@6 (line 29)
0x00000000026c020b: cmp %r9d,%edx
0x00000000026c020e: jne 0x00000000026c021c ;*ifeq
; - Test04::fizzBuzzCore1@21 (line 33)
0x00000000026c0210: cmp %r11d,%edx
0x00000000026c0213: jne 0x00000000026c021c ;*ifeq
; - Test04::fizzBuzzCore1@25 (line 33)
0x00000000026c0215: mov $0x4,%eax
0x00000000026c021a: jmp 0x00000000026c0239 ;*iload_1
; - Test04::fizzBuzzCore1@32 (line 35)
0x00000000026c021c: cmp %r9d,%edx
0x00000000026c021f: jne 0x00000000026c0228 ;*ifeq
; - Test04::fizzBuzzCore1@33 (line 35)
0x00000000026c0221: mov $0x3,%eax
0x00000000026c0226: jmp 0x00000000026c0239
0x00000000026c0228: cmp %r11d,%edx
0x00000000026c022b: jne 0x00000000026c0234 ;*ifeq
; - Test04::fizzBuzzCore1@41 (line 37)
0x00000000026c022d: mov $0x2,%eax
0x00000000026c0232: jmp 0x00000000026c0239
0x00000000026c0234: mov $0x1,%eax ;*irem
; - Test04::fizzBuzzCore1@6 (line 29)
0x00000000026c0239: add $0x10,%rsp
0x00000000026c023d: pop %rbp
0x00000000026c023e: test %eax,-0x2470244(%rip) # 0x0000000000250000
; {poll_return}
0x00000000026c0244: retq
0x00000000026c0245: hlt
0x00000000026c0246: hlt
0x00000000026c0247: hlt
0x00000000026c0248: hlt
0x00000000026c0249: hlt
0x00000000026c024a: hlt
0x00000000026c024b: hlt
0x00000000026c024c: hlt
0x00000000026c024d: hlt
0x00000000026c024e: hlt
0x00000000026c024f: hlt
0x00000000026c0250: hlt
0x00000000026c0251: hlt
0x00000000026c0252: hlt
0x00000000026c0253: hlt
0x00000000026c0254: hlt
0x00000000026c0255: hlt
0x00000000026c0256: hlt
0x00000000026c0257: hlt
0x00000000026c0258: hlt
0x00000000026c0259: hlt
0x00000000026c025a: hlt
0x00000000026c025b: hlt
0x00000000026c025c: hlt
0x00000000026c025d: hlt
0x00000000026c025e: hlt
0x00000000026c025f: hlt
[Exception Handler]
[Stub Code]
0x00000000026c0260: jmpq 0x000000000261c560 ; {no_reloc}
[Deopt Handler Code]
0x00000000026c0265: callq 0x00000000026c026a
0x00000000026c026a: subq $0x5,(%rsp)
0x00000000026c026f: jmpq 0x00000000025f6f40 ; {runtime_call}
0x00000000026c0274: hlt
0x00000000026c0275: hlt
0x00000000026c0276: hlt
0x00000000026c0277: hlt
Run Code Online (Sandbox Code Playgroud)
......有什么可以奇怪,却是:它不计算模操作了!
至少,没有明确说明:idiv此代码中没有出现任何指令!所以JIT真的努力避免代价高昂的分歧,通过做一些令人讨厌的,令人讨厌的苦涩的黑客攻击:指示
0x00000000026c01d6: imul $0x66666667,%r10,%r8
0x00000000026c01dd: imul $0x55555556,%r10,%r10
0x00000000026c01e4: sar $0x21,%r8
0x00000000026c01e8: sar $0x20,%r10
(and following...)
Run Code Online (Sandbox Code Playgroud)
是一个部门的"无分工"实施.例如,该方法
private static int divideBy3(int n)
{
long r10 = n;
r10 *= 0x55555556L;
r10 >>>= 0x20;
long r10d = r10 & 0xFFFFFFFFL;
return (int)r10d;
}
Run Code Online (Sandbox Code Playgroud)
使用这些魔术常数并移位以计算除以3(类似地,对于5除以另一个常数).我自己没有做过数学计算,但可以在Hacker's Delight的"INTEGER DIVISION BY CONSTANTS"文档的第32页找到关于模运算如何从中得出的解释.