Ste*_*sbo 13 java performance new-operator
I have two methods that read a string and create Character objects:
static void newChar(String string) {
    int len = string.length();
    System.out.println("Reading " + len + " characters");
    for (int i = 0; i < len; i++) {
        Character cur = new Character(string.charAt(i));
    }
}
and
static void justChar(String string) {
    int len = string.length();
    for (int i = 0; i < len; i++) {
        Character cur = string.charAt(i);
    }
}
When I run the methods on a string of 18,554,760 characters, I get very different run times. The output I get is:
newChar took: 20 ms
justChar took: 41 ms
For a smaller input (4,638,690 characters), the times barely differ:
newChar took: 12 ms
justChar took: 13 ms
Why is new so much more efficient in this case?
EDIT:
My benchmark code is pretty hacky.
start = System.currentTimeMillis();
newChar(largeString);
end = System.currentTimeMillis();
diff = end-start;
System.out.println("New char took: " + diff + " ms");
start = System.currentTimeMillis();
justChar(largeString);
end = System.currentTimeMillis();
diff = end-start;
System.out.println("just char took: " + diff+ " ms");
Ale*_*lev 22
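Well, I'm not sure whether Marko intentionally replicated the original mistake. TL;DR: the new instance is not used, so it gets eliminated. Adjusting the benchmark reverses the result. Don't trust faulty benchmarks; learn from them.
Here's the JMH benchmark: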
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@BenchmarkMode(Mode.AverageTime)
@Warmup(iterations = 3, time = 1)
@Measurement(iterations = 3, time = 1)
@Fork(3)
@State(Scope.Thread)
public class Chars {

    // Source needs to be @State field to avoid constant optimizations
    // on sources. Results need to be sinked into the Blackhole to
    // avoid dead-code elimination
    private String string;

    @Setup
    public void setup() {
        string = "12345678901234567890";
        for (int i = 0; i < 10; i++) {
            string += string;
        }
    }

    @GenerateMicroBenchmark
    public void newChar_DCE(BlackHole bh) {
        int len = string.length();
        for (int i = 0; i < len; i++) {
            Character c = new Character(string.charAt(i));
        }
    }

    @GenerateMicroBenchmark
    public void justChar_DCE(BlackHole bh) {
        int len = string.length();
        for (int i = 0; i < len; i++) {
            Character c = Character.valueOf(string.charAt(i));
        }
    }

    @GenerateMicroBenchmark
    public void newChar(BlackHole bh) {
        int len = string.length();
        for (int i = 0; i < len; i++) {
            Character c = new Character(string.charAt(i));
            bh.consume(c);
        }
    }

    @GenerateMicroBenchmark
    public void justChar(BlackHole bh) {
        int len = string.length();
        for (int i = 0; i < len; i++) {
            Character c = Character.valueOf(string.charAt(i));
            bh.consume(c);
        }
    }

    @GenerateMicroBenchmark
    public void newChar_prim(BlackHole bh) {
        int len = string.length();
        for (int i = 0; i < len; i++) {
            char c = new Character(string.charAt(i));
            bh.consume(c);
        }
    }

    @GenerateMicroBenchmark
    public void justChar_prim(BlackHole bh) {
        int len = string.length();
        for (int i = 0; i < len; i++) {
            char c = Character.valueOf(string.charAt(i));
            bh.consume(c);
        }
    }
}
...and this is the result:
Benchmark Mode Samples Mean Mean error Units
o.s.Chars.justChar avgt 9 93.051 0.365 us/op
o.s.Chars.justChar_DCE avgt 9 62.018 0.092 us/op
o.s.Chars.justChar_prim avgt 9 82.897 0.440 us/op
o.s.Chars.newChar avgt 9 117.962 4.679 us/op
o.s.Chars.newChar_DCE avgt 9 25.861 0.102 us/op
o.s.Chars.newChar_prim avgt 9 41.334 0.183 us/op
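DCE stands for "Dead Code Elimination", and that is what the original benchmark suffers from. If we eliminate this effect, which in JMH means sinking the values into the Blackhole, the scores reverse. So, in retrospect, this seems to indicate that new Character() in the original code benefits greatly from DCE, while Character.valueOf is not as successful. I'm not sure we should discuss why, because this has no bearing on real-world use cases, where the produced Characters are actually used.
You can go further from here on two fronts:
UPD: Following up on Marko's question, it seems the major impact comes from eliminating the allocation itself, whether via EA or DCE; see the *_prim tests.
UPD2: Looked at the assembly. The same run with -XX:-DoEscapeAnalysis confirms that the major effect comes from eliminating the allocation, as a result of escape analysis: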
Benchmark Mode Samples Mean Mean error Units
o.s.Chars.justChar avgt 9 94.318 4.525 us/op
o.s.Chars.justChar_DCE avgt 9 61.993 0.227 us/op
o.s.Chars.justChar_prim avgt 9 82.824 0.634 us/op
o.s.Chars.newChar avgt 9 118.862 1.096 us/op
o.s.Chars.newChar_DCE avgt 9 97.530 2.485 us/op
o.s.Chars.newChar_prim avgt 9 101.905 1.871 us/op
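This proves the original DCE conjecture is incorrect. EA is the major contributor. The DCE results are still faster because we do not pay the cost of unboxing, or of handling the returned value in any way. The benchmark is nevertheless faulty in that regard.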
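One rough way to watch the allocation being eliminated outside JMH is to count the bytes the current thread allocates around the loop, then rerun the same program with -XX:-DoEscapeAnalysis. The following is only a sketch under stated assumptions: AllocationProbe and Box are hypothetical names, Box stands in for Character to avoid its deprecated constructor, and the com.sun.management.ThreadMXBean counter is HotSpot-specific and may report -1 where it is unsupported.

```java
import java.lang.management.ManagementFactory;

final class AllocationProbe {

    /** Hypothetical stand-in for a boxed char (not java.lang.Character). */
    static final class Box {
        final char value;
        Box(char value) { this.value = value; }
    }

    /** Allocates one Box per character; none of them escapes this method. */
    static int sumFresh(String s) {
        int sum = 0;
        for (int i = 0; i < s.length(); i++) {
            sum += new Box(s.charAt(i)).value;
        }
        return sum;
    }

    /** Bytes allocated by the current thread while sumFresh runs once. */
    static long allocatedBytesDuring(String s) {
        // HotSpot-specific extension of the standard ThreadMXBean.
        com.sun.management.ThreadMXBean bean =
                (com.sun.management.ThreadMXBean) ManagementFactory.getThreadMXBean();
        long id = Thread.currentThread().getId();
        long before = bean.getThreadAllocatedBytes(id);
        int sum = sumFresh(s);
        long after = bean.getThreadAllocatedBytes(id);
        // Use the result so the loop itself cannot be discarded as dead code.
        if (sum < 0) throw new AssertionError("char sums are never negative");
        return after - before;
    }

    public static void main(String[] args) {
        String s = "12345678901234567890".repeat(1024);
        // Warm up so the JIT compiler has a chance to compile sumFresh.
        for (int i = 0; i < 2_000; i++) sumFresh(s);
        System.out.println("bytes allocated: " + allocatedBytesDuring(s));
    }
}
```

Comparing the printed figure from a default run against a run with -XX:-DoEscapeAnalysis gives a rough indication of whether the per-character allocation was removed. The exact numbers depend on the JVM version and flags, so treat them as indicative only.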
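Your measurement does expose a real effect.
It is mostly incidental, though: your benchmark has many technical flaws, and the effect it exposes is probably not the one you have in mind.
The new Character() approach is faster if and only if HotSpot's Escape Analysis succeeds in proving that the resulting instance can safely be allocated on the stack instead of the heap. The effect is therefore not nearly as general as your question implies.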
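new Character() is faster because of locality of reference: your instance is on the stack, and all access to it is via CPU cache hits. When you reuse a cached instance, you must
- access a static field;
- dereference an array entry into a Character instance;
- access the char contained in that instance.
Each dereference is a potential CPU cache miss. Furthermore, it forces part of the cache to be redirected towards those remote locations, causing more cache misses on the input string and/or the stack locations.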
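The dereference chain above can be modeled in a few lines. This is a simplified sketch, not the JDK source: BoxCacheSketch, Box and CACHE are hypothetical names, and the 128-entry size mirrors the range '\u0000' to '\u007F' that Character.valueOf is documented to cache.

```java
final class BoxCacheSketch {

    /** Hypothetical stand-in for java.lang.Character: an object wrapping one char. */
    static final class Box {
        final char value;
        Box(char value) { this.value = value; }
    }

    // Step 1: a static field that holds the cache array.
    private static final Box[] CACHE = new Box[128];
    static {
        for (int i = 0; i < CACHE.length; i++) {
            CACHE[i] = new Box((char) i);
        }
    }

    /** Cached path: static field, then array entry, then the shared instance. */
    static Box valueOf(char c) {
        if (c < CACHE.length) {
            return CACHE[c];   // Step 2: load a reference from the array
        }
        return new Box(c);     // outside the cached range: allocate
    }

    /** Sums the chars through cached boxes; reading .value is step 3. */
    static int sumCached(String s) {
        int sum = 0;
        for (int i = 0; i < s.length(); i++) {
            sum += valueOf(s.charAt(i)).value;
        }
        return sum;
    }

    /** Sums the chars through fresh boxes that never leave the loop body. */
    static int sumFresh(String s) {
        int sum = 0;
        for (int i = 0; i < s.length(); i++) {
            sum += new Box(s.charAt(i)).value;
        }
        return sum;
    }

    public static void main(String[] args) {
        String s = "12345678901234567890";
        System.out.println(sumCached(s) == sumFresh(s));
        System.out.println(valueOf('a') == valueOf('a'));
    }
}
```

Both paths compute the same sum. The difference is only in how many memory locations each iteration has to touch, and in whether the JIT compiler can prove that the fresh Box never escapes.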
I ran this code with jmh:
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@BenchmarkMode(Mode.AverageTime)
public class Chars {
    static String string = "12345678901234567890";
    static {
        for (int i = 0; i < 10; i++) string += string;
    }

    @GenerateMicroBenchmark
    public void newChar() {
        int len = string.length();
        for (int i = 0; i < len; i++) new Character(string.charAt(i));
    }

    @GenerateMicroBenchmark
    public void justChar() {
        int len = string.length();
        for (int i = 0; i < len; i++) Character.valueOf(string.charAt(i));
    }
}
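This keeps the essence of your code but eliminates some systematic errors, such as warm-up and compilation times. These are the results: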
Benchmark Mode Thr Cnt Sec Mean Mean error Units
o.s.Chars.justChar avgt 1 3 5 39.062 6.587 usec/op
o.s.Chars.newChar avgt 1 3 5 19.114 0.653 usec/op
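This would be my best guess at what is going on:
- in newChar you are creating a fresh instance of Character. HotSpot's Escape Analysis can prove that the instance never escapes, so it allows stack allocation or, in the special case of Character, could eliminate the allocation altogether, because the data in it is provably never used;
- in justChar you involve a lookup into the Character cache array, which has some cost.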
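The `Character cur = string.charAt(i);` line in the question is a boxing conversion, which javac compiles to a Character.valueOf call; that is why the benchmarks here spell out valueOf. A small sketch that makes the cache visible through reference identity follows. BoxingIdentityDemo is a hypothetical name, and the sketch only checks chars in the range '\u0000' to '\u007f', where the Java Language Specification guarantees identical references; outside that range the outcome is implementation-dependent.

```java
final class BoxingIdentityDemo {

    /** Returns true if autoboxing the same char twice yields the same object. */
    static boolean sameBoxedInstance(char c) {
        Character first = c;     // boxing conversion (javac emits Character.valueOf)
        Character second = c;
        return first == second;  // deliberate reference comparison, not equals()
    }

    /** Returns true if autoboxing and an explicit valueOf call share the instance. */
    static boolean boxingMatchesValueOf(char c) {
        Character boxed = c;
        return boxed == Character.valueOf(c);
    }

    public static void main(String[] args) {
        System.out.println(sameBoxedInstance('a'));
        System.out.println(boxingMatchesValueOf('7'));
    }
}
```

For chars in the cached range, both calls return the shared instance, which is the lookup cost that justChar pays on every iteration.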
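In response to Aleks' criticism, I added some more methods to the benchmark. The main effect remains stable, but we get finer-grained detail about the lesser optimization effects.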
@GenerateMicroBenchmark
public int newCharUsed() {
    int len = string.length(), sum = 0;
    for (int i = 0; i < len; i++) sum += new Character(string.charAt(i));
    return sum;
}

@GenerateMicroBenchmark
public int justCharUsed() {
    int len = string.length(), sum = 0;
    for (int i = 0; i < len; i++) sum += Character.valueOf(string.charAt(i));
    return sum;
}

@GenerateMicroBenchmark
public void newChar() {
    int len = string.length();
    for (int i = 0; i < len; i++) new Character(string.charAt(i));
}

@GenerateMicroBenchmark
public void justChar() {
    int len = string.length();
    for (int i = 0; i < len; i++) Character.valueOf(string.charAt(i));
}

@GenerateMicroBenchmark
public void newCharValue() {
    int len = string.length();
    for (int i = 0; i < len; i++) new Character(string.charAt(i)).charValue();
}

@GenerateMicroBenchmark
public void justCharValue() {
    int len = string.length();
    for (int i = 0; i < len; i++) Character.valueOf(string.charAt(i)).charValue();
}
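- the basic versions are justChar and newChar;
- the ...Value methods add a charValue call to the basic version;
- the ...Used methods add the charValue call (implicitly) and use the value, to preclude any dead code elimination.
Benchmark Mode Thr Cnt Sec Mean Mean error Units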
o.s.Chars.justChar avgt 1 3 1 246.847 5.969 usec/op
o.s.Chars.justCharUsed avgt 1 3 1 370.031 26.057 usec/op
o.s.Chars.justCharValue avgt 1 3 1 296.342 60.705 usec/op
o.s.Chars.newChar avgt 1 3 1 123.302 10.596 usec/op
o.s.Chars.newCharUsed avgt 1 3 1 172.721 9.055 usec/op
o.s.Chars.newCharValue avgt 1 3 1 123.040 5.095 usec/op
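- some dead code elimination shows in both the justChar and newChar variants, but it is only partial;
- with the newChar variant, adding charValue has no effect, so apparently it was DCE'd;
- with justChar, charValue does have an effect, so it seems not to have been eliminated;
- DCE has only a minor overall effect, as shown by the stable difference between newCharUsed and justCharUsed.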