Sye*_*san 12 javascript performance loops v8 spidermonkey
我正在通过Eloquent JavaScript(再次)并遇到了第2章的练习"国际象棋棋盘".我第一次阅读它的时候写了一个不错的解决方案,并在Elequent Javascript网站上提供了另一个版本的解决方案.我是那些想成为超级高效程序员的新手之一,头脑中只有一个问题:"我能不能让它更快或更小?"
因此,在几个月前我在网上搜索时,我遇到了一个关于Stack Overflow 的问题,关于基于性能的for
循环vs while
循环.因为在那个线程中提到for
循环比慢,while
并且循环与递减迭代器更快所以我重写了代码以获得更好的性能.
这是for
替换的新版本while
和为减少编辑的条件:
console.time("looping");
var gridSize = 5000, str = '', i = gridSize, j;
while (i--) {
j = gridSize;
while (j--) {
if ((i - j) % 2 === 0)
str += " ";
else
str += "#";
}
str += "\n";
}
//console.log(str);
console.timeEnd("looping");
Run Code Online (Sandbox Code Playgroud)
但令我惊讶的是,这段代码的性能更差.然后,过了一会儿我发现那if ((i - j) % 2 === 0)
是主要的罪魁祸首,用plus替换减号将执行的总时间减少到~ 750ms
//Checked on NODE.js = v6.11.2
Book version of code --> 893.76 ms
while loop with subtraction --> 1562.43 ms
while loop with addition --> 749.62 ms
//firefox Benchmark v54.0 (64-bit) (OS Ubuntu 16.04)
Book version of code --> 725.10 ms
while loop with subtraction --> 1565.29 ms
while loop with addition --> 601.12 ms
Run Code Online (Sandbox Code Playgroud)
为什么减法会对总执行时间产生如此巨大的影响?
在看了@jaromandaX后,我很确定这不是减慢这个循环的减速,它是负数的模数.我想再次知道是什么让负数的模数更慢
这远非完整答案,需要进一步调查(或者了解了解V8实施细节的人的见解).不过,这是我的发现:
Sidenode:使用以下命令行收集运行Node.JS的结果:
node --expose-gc --print-code --code-comments --print-opt-code --trace-hydrogen --redirect-code-traces --redirect-code-traces-to = code.asm - trace_representation --trace-deopt --trace-opt 1.js
并稍微研究一下V8源代码.
1.性能差异来自于在这种情况下生成不同的机器代码的事实.对于+
代码%
是
;;; <@134,#123> add-i
00000151A32DD74B 395 03c2 addl rax,rdx
00000151A32DD74D 397 0f807a030000 jo 1293 (00000151A32DDACD)
;;; <@136,#126> mod-by-power-of-2-i
00000151A32DD753 403 85c0 testl rax,rax
00000151A32DD755 405 790f jns 422 (00000151A32DD766)
00000151A32DD757 407 f7d8 negl rax
00000151A32DD759 409 83e001 andl rax,0x1
00000151A32DD75C 412 f7d8 negl rax
00000151A32DD75E 414 0f846e030000 jz 1298 (00000151A32DDAD2)
00000151A32DD764 420 eb03 jmp 425 (00000151A32DD769)
00000151A32DD766 422 83e001 andl rax,0x1
;;; <@138,#200> smi-tag
00000151A32DD769 425 8bd8 movl rbx,rax
00000151A32DD76B 427 48c1e320 REX.W shlq rbx, 32
;;; <@140,#130> gap
00000151A32DD76F 431 488bc2 REX.W movq rax,rdx
Run Code Online (Sandbox Code Playgroud)
而-
代码却复杂得多:
;;; <@136,#123> sub-i
00000151A32E57E1 417 412bc3 subl rax,r11
00000151A32E57E4 420 0f8039040000 jo 1507 (00000151A32E5C23)
;;; <@138,#200> int32-to-double
00000151A32E57EA 426 c5f957c0 vxorpd xmm0,xmm0,xmm0
00000151A32E57EE 430 c5fb2ac0 vcvtlsi2sd xmm0,xmm0,rax
;;; <@139,#200> gap
00000151A32E57F2 434 c5f928ca vmovapd xmm1,xmm2
;;; <@140,#126> mod-d
00000151A32E57F6 438 4989e2 REX.W movq r10,rsp
00000151A32E57F9 441 4883ec28 REX.W subq rsp,0x28
00000151A32E57FD 445 4883e4f0 REX.W andq rsp,0xf0
00000151A32E5801 449 4c89542420 REX.W movq [rsp+0x20],r10
00000151A32E5806 454 48b830d4124001000000 REX.W movq rax,000000014012D430
00000151A32E5810 464 ffd0 call rax
00000151A32E5812 466 488b642420 REX.W movq rsp,[rsp+0x20]
;;; <@142,#126> lazy-bailout
;;; <@144,#202> number-tag-d
00000151A32E5817 471 498b9dc06f0400 REX.W movq rbx,[r13+0x46fc0]
00000151A32E581E 478 488bc3 REX.W movq rax,rbx
00000151A32E5821 481 4883c010 REX.W addq rax,0x10
00000151A32E5825 485 493b85c86f0400 REX.W cmpq rax,[r13+0x46fc8]
00000151A32E582C 492 0f878f030000 ja 1409 (00000151A32E5BC1)
00000151A32E5832 498 498985c06f0400 REX.W movq [r13+0x46fc0],rax
00000151A32E5839 505 48ffc3 REX.W incq rbx
00000151A32E583C 508 4d8b5560 REX.W movq r10,[r13+0x60]
00000151A32E5840 512 4c8953ff REX.W movq [rbx-0x1],r10
00000151A32E5844 516 c5fb114307 vmovsd [rbx+0x7],xmm0
;;; <@146,#130> gap
00000151A32E5849 521 488b45a0 REX.W movq rax,[rbp-0x60]
00000151A32E584D 525 488b7db8 REX.W movq rdi,[rbp-0x48]
00000151A32E5851 529 488b75c0 REX.W movq rsi,[rbp-0x40]
00000151A32E5855 533 488b4dc8 REX.W movq rcx,[rbp-0x38]
00000151A32E5859 537 488b55b0 REX.W movq rdx,[rbp-0x50]
00000151A32E585D 541 4c8b4da8 REX.W movq r9,[rbp-0x58]
00000151A32E5861 545 4c8b4598 REX.W movq r8,[rbp-0x68]
00000151A32E5865 549 c5fb109578ffffff vmovsd xmm2,[rbp-0x88]
Run Code Online (Sandbox Code Playgroud)
简而言之,不同之处在于对于"加"情况,Mod(%
)操作是使用高度专业化的mod-by-power-of-2-i
机器代码执行的,但对于"减"情况,使用mod-d
基于浮点的算术实现来完成.
另请注意,mod-by-power-of-2-i
机器代码确实支持负值.它可以粗略地重写为这样的东西:
if (rax < 0) {
rax = -rax;
rax = rax & 1;
rax = -rax;
}
else {
rax = rax & 1;
}
Run Code Online (Sandbox Code Playgroud)
因此,这不是仅针对正值的优化机器代码的情况.
2.生成代码的差异似乎来自类型推断以不同方式工作的事实.生成的日志--trace_representation
显示简化程序的"加"和"减"情况之间的以下差异:
var f_minus = function(log) {
var str = '', i = gridSize, j;
while (i--) {
j = gridSize;
while (j--) {
var ttt = (i - j) % 2
}
}
if(log) {
if(ttt == -1)
console.log(t);
}
}
var f_plus = function(log) {
var str = '', i = gridSize, j;
while (i--) {
j = gridSize;
while (j--) {
var ttt = (i + j) % 2
}
}
if(log){
if(ttt == -1)
console.log(t);
}
}
Run Code Online (Sandbox Code Playgroud)
相比
[marking 00000025D4303E91 <JS Function f_minus (SharedFunctionInfo 00000278933F61C1)> for optimized recompilation, reason: small function, ICs with typeinfo: 8/12 (66%), generic ICs: 0/12 (0%)]
[compiling method 00000025D4303E91 <JS Function f_minus (SharedFunctionInfo 00000278933F61C1)> using Crankshaft OSR]
#37 Phi is used by real #110 Branch as v
#38 Phi is used by real #58 Add as t
#38 Phi is used by real #69 StackCheck as v
#38 Phi is used by real #70 LoadContextSlot as v
#38 Phi is used by real #122 CompareGeneric as t
#38 Phi is used by real #132 LoadGlobalGeneric as v
#38 Phi is used by real #134 LoadNamedGeneric as v
#38 Phi is used by real #136 LoadGlobalGeneric as v
#38 Phi is used by real #141 CallWithDescriptor as v
#38 Phi is used by real #156 Return as v
#38 Phi is used by real #101 Mod as t
#38 Phi is used by real #98 Sub as t
#38 Phi is used by real #95 StackCheck as v
#38 Phi is used by real #84 Add as t
#47 Phi is used by real #56 ForceRepresentation as s
#49 Phi is used by real #122 CompareGeneric as t
#77 Phi is used by real #83 ForceRepresentation as s
generalizing use representation 'v' of #40 Phi with uses of #47 Phi 's'
generalizing use representation 'v' of #42 Phi with uses of #49 Phi 't'
generalizing use representation 't' of #42 Phi with uses of #78 Phi 'v'
generalizing use representation 'v' of #49 Phi with uses of #78 Phi 'v'
generalizing use representation 'v' of #78 Phi with uses of #49 Phi 't'
Changing #101 Mod representation v -> i based on inputs
Changing #101 Mod representation i -> d based on output
Changing #98 Sub representation v -> s based on inputs
Changing #98 Sub representation s -> i based on use requirements
Changing #84 Add representation v -> i based on inputs
...
Run Code Online (Sandbox Code Playgroud)
对此
[marking 000002C17CAAB341 <JS Function f_plus (SharedFunctionInfo 00000278933F6289)> for optimized recompilation, reason: small function, ICs with typeinfo: 8/12 (66%), generic ICs: 0/12 (0%)]
[compiling method 000002C17CAAB341 <JS Function f_plus (SharedFunctionInfo 00000278933F6289)> using Crankshaft OSR]
#37 Phi is used by real #110 Branch as v
#38 Phi is used by real #58 Add as t
#38 Phi is used by real #69 StackCheck as v
#38 Phi is used by real #70 LoadContextSlot as v
#38 Phi is used by real #122 CompareGeneric as t
#38 Phi is used by real #132 LoadGlobalGeneric as v
#38 Phi is used by real #134 LoadNamedGeneric as v
#38 Phi is used by real #136 LoadGlobalGeneric as v
#38 Phi is used by real #141 CallWithDescriptor as v
#38 Phi is used by real #156 Return as v
#38 Phi is used by real #101 Mod as t
#38 Phi is used by real #98 Add as t
#38 Phi is used by real #95 StackCheck as v
#38 Phi is used by real #84 Add as t
#47 Phi is used by real #56 ForceRepresentation as s
#49 Phi is used by real #122 CompareGeneric as t
#77 Phi is used by real #83 ForceRepresentation as s
generalizing use representation 'v' of #40 Phi with uses of #47 Phi 's'
generalizing use representation 'v' of #42 Phi with uses of #49 Phi 't'
generalizing use representation 't' of #42 Phi with uses of #78 Phi 'v'
generalizing use representation 'v' of #49 Phi with uses of #78 Phi 'v'
generalizing use representation 'v' of #78 Phi with uses of #49 Phi 't'
Changing #101 Mod representation v -> i based on inputs
Changing #98 Add representation v -> s based on inputs
Changing #98 Add representation s -> i based on use requirements
Changing #84 Add representation v -> i based on inputs
...
Run Code Online (Sandbox Code Playgroud)
有趣的区别是线
Changing #101 Mod representation i -> d based on output
Run Code Online (Sandbox Code Playgroud)
这只存在于f_minus
但不是这种f_plus
情况.由于某种原因,编译器认为在f_minus
案例类型中应该是Double而不是Integer,基于输出值的猜测.有趣的是,如果我改变了这条线
var ttt = (i - j) % 2
Run Code Online (Sandbox Code Playgroud)
至
var ttt = (i - j + gridSize) % 2;
Run Code Online (Sandbox Code Playgroud)
它再次开始生成快速mod-by-power-of-2-i
代码.所以是的,看起来输出值可能会影响优化编译器.但目前尚不清楚为什么会出现这种情况.
乍一看,这种行为看起来像是优化编译器中的一个错误,它没有注意到"减"的情况也可以处理mod-by-power-of-2-i
.我无法追踪为什么会发生这种情况只是浏览源代码,因此欢迎进一步的输入.
而不是使用昂贵的模块化操作
((i - j) % 2 === 0)
Run Code Online (Sandbox Code Playgroud)
你可以使用按位运算
(((i-j)&1) === 0)
Run Code Online (Sandbox Code Playgroud)
正如 SBS 在评论中建议的那样,你也应该尝试
(((i^j)&1) === 0)
Run Code Online (Sandbox Code Playgroud)
归档时间: |
|
查看次数: |
613 次 |
最近记录: |