Java System.arraycopy()
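Is System.arraycopy() efficient for small arrays, or does the fact that it is a native method make it substantially less efficient than a simple loop and a function call?
Do native methods incur additional performance overhead for crossing some kind of Java system bridge?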
van*_*nza 27
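Expanding a little on what Sid has written, it is very likely that System.arraycopy is just a JIT intrinsic. This means that when code calls System.arraycopy, it will most probably be calling a JIT-specific implementation (once the JIT has tagged System.arraycopy as "hot") rather than going through the JNI interface, so it does not incur the normal overhead of native methods.
In general, executing native methods does have some overhead (going through the JNI interface, and some internal JVM operations cannot happen while a native method is executing). But a method being marked "native" does not mean that you are actually executing it through JNI. The JIT can do some crazy things.
The easiest way to check is, as has already been suggested, to write a small benchmark, being careful with the usual caveats of Java microbenchmarks (warm up the code first, avoid code with no side effects because the JIT will simply optimize it into a no-op, and so on).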
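As a rough illustration of those two caveats, here is a minimal sketch in Java. The class name ArrayCopyWarmup, the method copyAndSum, the array size of 16 and the round counts are all made up for this example; they are assumptions, not measured recommendations. The warm-up call gives the JIT a chance to compile the hot path before timing starts, and the checksum that is returned and printed gives the copy an observable side effect.

```java
public class ArrayCopyWarmup {
    // Copies src into dst on every round and returns a checksum, so the
    // copy has an observable effect and cannot be discarded as dead code.
    static long copyAndSum(int[] src, int[] dst, int rounds) {
        long sum = 0;
        for (int r = 0; r < rounds; r++) {
            src[0] = r; // change the input on every round
            System.arraycopy(src, 0, dst, 0, src.length);
            sum += dst[0] + dst[dst.length - 1];
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] src = new int[16]; // hypothetical size, chosen only for illustration
        int[] dst = new int[16];

        // Warm-up phase: not timed, only meant to get the code compiled.
        long sink = copyAndSum(src, dst, 1_000_000);

        // Timed phase.
        long begin = System.nanoTime();
        sink += copyAndSum(src, dst, 10_000_000);
        long end = System.nanoTime();

        // Printing the checksum keeps the result in use.
        System.out.println("elapsed: " + (end - begin) / 1e9 + " s, sink = " + sink);
    }
}
```

A hand-rolled harness like this is only a first approximation; a dedicated benchmarking framework handles warm-up and dead-code elimination more rigorously.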
小智 24
Here is my benchmark code:
public void test(int copySize, int copyCount, int testRep) {
    System.out.println("Copy size = " + copySize);
    System.out.println("Copy count = " + copyCount);
    System.out.println();
    for (int i = testRep; i > 0; --i) {
        copy(copySize, copyCount);
        loop(copySize, copyCount);
    }
    System.out.println();
}

public void copy(int copySize, int copyCount) {
    int[] src = newSrc(copySize + 1);
    int[] dst = new int[copySize + 1];
    long begin = System.nanoTime();
    for (int count = copyCount; count > 0; --count) {
        System.arraycopy(src, 1, dst, 0, copySize);
        dst[copySize] = src[copySize] + 1;
        System.arraycopy(dst, 0, src, 0, copySize);
        src[copySize] = dst[copySize];
    }
    long end = System.nanoTime();
    System.out.println("Arraycopy: " + (end - begin) / 1e9 + " s");
}

public void loop(int copySize, int copyCount) {
    int[] src = newSrc(copySize + 1);
    int[] dst = new int[copySize + 1];
    long begin = System.nanoTime();
    for (int count = copyCount; count > 0; --count) {
        for (int i = copySize - 1; i >= 0; --i) {
            dst[i] = src[i + 1];
        }
        dst[copySize] = src[copySize] + 1;
        for (int i = copySize - 1; i >= 0; --i) {
            src[i] = dst[i];
        }
        src[copySize] = dst[copySize];
    }
    long end = System.nanoTime();
    System.out.println("Man. loop: " + (end - begin) / 1e9 + " s");
}

public int[] newSrc(int arraySize) {
    int[] src = new int[arraySize];
    for (int i = arraySize - 1; i >= 0; --i) {
        src[i] = i;
    }
    return src;
}
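According to my tests, calling test() with copyCount = 10000000 (1e7) or more lets the warm-up complete during the first copy/loop call, so testRep = 5 is enough. With copyCount = 1000000 (1e6), the warm-up needs at least 2 or 3 iterations, so testRep should be increased to obtain usable results.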
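With my configuration (CPU Intel Core 2 Duo E8500 @ 3.16GHz, Java SE 1.6.0_35-b10 and Eclipse 3.7.2), the benchmark shows that:
- When copySize = 24, System.arraycopy() and the manual loop take almost the same time (sometimes one is very slightly faster than the other, sometimes the reverse).
- When copySize < 24, the manual loop is faster than System.arraycopy() (slightly faster with copySize = 23, much faster with copySize < 5).
- When copySize > 24, System.arraycopy() is faster than the manual loop (slightly faster with copySize = 25, with the ratio of loop time to arraycopy time growing as copySize increases).
Note: I am not a native English speaker, so please excuse any grammar or vocabulary errors.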
irr*_*ble 17
This is a valid concern. For example, in java.nio.DirectByteBuffer.put(byte[]), the author tries to avoid a JNI copy for a small number of elements:
// These numbers represent the point at which we have empirically
// determined that the average cost of a JNI call exceeds the expense
// of an element by element copy. These numbers may change over time.
static final int JNI_COPY_TO_ARRAY_THRESHOLD = 6;
static final int JNI_COPY_FROM_ARRAY_THRESHOLD = 6;
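The idea behind such a threshold can be sketched as follows. This is not the JDK's implementation: the class ThresholdCopy and the constant SMALL_COPY_THRESHOLD are hypothetical names, the value 6 is borrowed from the constants quoted above purely for illustration, and the sketch assumes that src and dst are different arrays (the forward loop is not safe for overlapping ranges within one array).

```java
import java.util.Arrays;

public class ThresholdCopy {
    // Hypothetical cutoff; a real value would have to be measured on the target JVM.
    static final int SMALL_COPY_THRESHOLD = 6;

    // Copies length bytes from src to dst, choosing the strategy by size.
    // Assumes src and dst are distinct arrays.
    static void copy(byte[] src, int srcPos, byte[] dst, int dstPos, int length) {
        if (length < SMALL_COPY_THRESHOLD) {
            // Few elements: copy one by one and skip the bulk-copy call.
            for (int i = 0; i < length; i++) {
                dst[dstPos + i] = src[srcPos + i];
            }
        } else {
            // Enough elements that the bulk copy is expected to pay off.
            System.arraycopy(src, srcPos, dst, dstPos, length);
        }
    }

    public static void main(String[] args) {
        byte[] src = {1, 2, 3, 4, 5, 6, 7, 8};
        byte[] small = new byte[8];
        byte[] large = new byte[8];
        copy(src, 0, small, 0, 3); // takes the element-by-element path
        copy(src, 0, large, 0, 8); // takes the System.arraycopy path
        System.out.println(Arrays.toString(small));
        System.out.println(Arrays.toString(large));
    }
}
```

Whether such a cutoff helps for System.arraycopy() itself depends on the JVM and hardware, which is what the benchmarks in this thread try to measure.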
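As for System.arraycopy(), we can look at how the JDK itself uses it. For example, ArrayList always uses System.arraycopy() and never copies element by element, regardless of length (even if the length is 0). Since ArrayList is very performance-conscious, we can infer that System.arraycopy() is the most efficient way to copy arrays regardless of length.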
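To make that usage pattern concrete, here is a minimal sketch of an array-backed list that shifts its elements with System.arraycopy() on every insert and removal, including the cases where zero elements move. SimpleIntList is a hypothetical class written for this example; it is not the JDK's ArrayList source.

```java
import java.util.Arrays;

public class SimpleIntList {
    private int[] elements = new int[4];
    private int size;

    // Inserts value at index, shifting the tail one slot to the right.
    void add(int index, int value) {
        if (index < 0 || index > size) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
        if (size == elements.length) {
            elements = Arrays.copyOf(elements, size * 2); // grow the backing array
        }
        // The number of moved elements is 0 when appending at the end.
        System.arraycopy(elements, index, elements, index + 1, size - index);
        elements[index] = value;
        size++;
    }

    // Removes and returns the element at index, shifting the tail left.
    int remove(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
        int old = elements[index];
        // The number of moved elements is 0 when removing the last element.
        System.arraycopy(elements, index + 1, elements, index, size - index - 1);
        size--;
        return old;
    }

    // Returns a copy of the live portion of the backing array.
    int[] toArray() {
        return Arrays.copyOf(elements, size);
    }

    public static void main(String[] args) {
        SimpleIntList list = new SimpleIntList();
        list.add(0, 10);
        list.add(1, 30);
        list.add(1, 20);
        System.out.println(Arrays.toString(list.toArray()));
    }
}
```

System.arraycopy() is specified to behave as if the source range were first copied to a temporary array, so shifting within the same array is safe in both directions.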
Instead of relying on speculation and possibly outdated information, I ran some benchmarks using Caliper. In fact, Caliper comes with some examples, including a CopyArrayBenchmark that measures exactly this question! All you have to do is run:
mvn exec:java -Dexec.mainClass=com.google.caliper.runner.CaliperMain -Dexec.args=examples.CopyArrayBenchmark
My results are based on Oracle's Java HotSpot(TM) 64-Bit Server VM, 1.8.0_31-b13, running on a mid-2010 MacBook Pro (macOS 10.11.6 with an Intel Arrandale i7, 8 GiB RAM). I don't believe that it's useful to post the raw timing data. Rather, I'll summarize the conclusions with the supporting visualizations.
In summary:
- Using a for loop to copy each element into a newly instantiated array is never advantageous, even for arrays as short as 5 elements.
- Arrays.copyOf(array, array.length) and array.clone() are both consistently fast. These two techniques are nearly identical in performance; which one you choose is a matter of taste.
- System.arraycopy(src, 0, dest, 0, src.length) is almost as fast as Arrays.copyOf(array, array.length) and array.clone(), but not quite consistently so. (See the case for 50000 ints.) Because of that, and the verbosity of the call, I would recommend System.arraycopy() if you need fine control over which elements get copied where.
[Timing plots]
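For reference, the three copying techniques compared in this summary look like this side by side. The class CopyTechniques and its method names are made up for this sketch; the calls themselves are standard JDK APIs. The last method shows the kind of fine-grained placement that only System.arraycopy() offers.

```java
import java.util.Arrays;

public class CopyTechniques {
    // Full copy via clone().
    static int[] viaClone(int[] a) {
        return a.clone();
    }

    // Full copy via Arrays.copyOf().
    static int[] viaCopyOf(int[] a) {
        return Arrays.copyOf(a, a.length);
    }

    // Full copy via System.arraycopy(); the destination is allocated by hand.
    static int[] viaArraycopy(int[] a) {
        int[] dest = new int[a.length];
        System.arraycopy(a, 0, dest, 0, a.length);
        return dest;
    }

    // Partial copy: place count elements of src, starting at srcPos,
    // into an existing array dest at destPos.
    static void copyRange(int[] src, int srcPos, int[] dest, int destPos, int count) {
        System.arraycopy(src, srcPos, dest, destPos, count);
    }

    public static void main(String[] args) {
        int[] original = {1, 2, 3, 4, 5};
        System.out.println(Arrays.toString(viaClone(original)));
        System.out.println(Arrays.toString(viaCopyOf(original)));
        System.out.println(Arrays.toString(viaArraycopy(original)));

        int[] target = new int[8];
        copyRange(original, 1, target, 4, 3);
        System.out.println(Arrays.toString(target));
    }
}
```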
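Bytecode is executed natively anyway, so the performance of System.arraycopy() is likely to be better than that of a loop.
In the case of a loop, the JVM has to execute bytecode, which incurs overhead, whereas an array copy should be a straight memory copy.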