xen*_*ide 0 java performance java-8 java-stream
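The timings of these tests differ a lot, yet they go through the same implementation. I would like to understand why the timings differ.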
private static final int ITERATIONS = 100;
private static final DataFactory RANDOM_DF = DataFactoryImpl.defaultInstance();
@Test // 6s
public void testGetMaxLength() throws Exception {
for ( int i = 1; i < ITERATIONS; i++ ) {
testGetMaxLength( i );
}
}
private void testGetMaxLength( final int length ) {
for ( int i = 0; i < ITERATIONS; i++ ) {
String word = RANDOM_DF.word().getMaxLength( length );
assertThat( word, not( isEmptyOrNullString() ) );
assertThat( word.length(), allOf( greaterThanOrEqualTo( 1 ), lessThanOrEqualTo( length ) ) );
}
}
@Test // 301ms
public void testGetLength() throws Exception {
for ( int i = 1; i < ITERATIONS; i++ ) {
testGetLength( i );
}
}
private void testGetLength( final int length ) {
for ( int i = 0; i < ITERATIONS; i++ ) {
String word = RANDOM_DF.word().getLength( length );
assertThat( word, not( isEmptyOrNullString() ) );
assertThat( word.length(), equalTo( length ) );
}
}
The class DataFactoryUtil most likely contains the code that causes the huge difference.
final class DataFactoryUtil {
private DataFactoryUtil() {
}
static <T> Optional<T> valueFromMap(
final Map<Integer, List<T>> map,
final IntUnaryOperator randomSupplier,
final int minInclusive,
final int maxInclusive
) {
List<T> list = map.entrySet()
.parallelStream() // line 26
.filter( e -> e.getKey() >= minInclusive && e.getKey() <= maxInclusive )
.map( Map.Entry::getValue )
.flatMap( Collection::stream )
.collect( Collectors.toList() );
return valueFromList( list, randomSupplier );
}
static <T> Optional<T> valueFromList( final List<T> list, final IntUnaryOperator randomSupplier ) {
int random = randomSupplier.applyAsInt( list.size() );
return list.isEmpty() ? Optional.empty() : Optional.of( list.get( random ) );
}
static List<String> dict() {
try {
URL url = DataFactoryUtil.class.getClassLoader().getResource( "dictionary" );
assert url != null;
return Files.lines( Paths.get( url.toURI() ) ).collect( Collectors.toList() );
}
catch ( URISyntaxException | IOException e ) {
throw new IllegalStateException( e );
}
}
}
Here are the different implementations:
@FunctionalInterface
public interface RandomStringFactory {
default String getMaxLength( final int maxInclusive ) {
return this.getRange( 1, maxInclusive );
}
String getRange( final int minInclusive, final int maxInclusive );
default String getLength( int length ) {
return this.getRange( length, length );
}
}
And the actual implementation of word:
DataFactoryImpl( final IntBinaryOperator randomSource, final List<String> wordSource ) {
this.random = randomSource;
this.wordSource = wordSource.stream().collect( Collectors.groupingBy( String::length ) );
}
public static DataFactory defaultInstance() {
return new DataFactoryImpl( RandomUtils::nextInt, dict() );
}
default RandomStringFactory word() {
return ( min, max ) -> valueFromMap( getWordSource(), ( size ) -> getRandom().applyAsInt( 0, size ), min, max )
.orElse( alphabetic().getRange( min, max ) );
}
Why do the measurements of these two methods differ so much when they share an implementation? Is there any way to improve the worst case, getMaxLength?
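Update
I like the source-of-randomness theory, and it may well be true. However, changing my code to the following results in a 13s run, which is more than twice the run time with RandomUtils::nextInt.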
public static DataFactory defaultInstance() {
return new DataFactoryImpl( (a, b) -> a == b ? a : ThreadLocalRandom.current().nextInt(a, b), dict() );
}
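Actually, the difference lies in the RandomUtils.nextInt() implementation you use to generate random numbers. If the startInclusive and endInclusive parameters match (as in getLength()), it simply returns the argument, which is very fast. Otherwise it asks a static instance of java.util.Random for a random number. java.util.Random is thread-safe, but it has a very serious contention problem: different threads cannot request random numbers independently, because they all starve in the same CAS loop. Since you use .parallelStream() in valueFromMap, you run into this problem.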
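To look at the contention in isolation, here is a minimal sketch. It is not the asker's code: the class RandomContentionSketch, its two sources, and timeParallelDraws are hypothetical names made up for illustration, and the numbers are drawn directly inside a parallel IntStream so that several threads hit the generator at once. Both sources use a half-open range [min, max).

```java
import java.util.Random;
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.IntBinaryOperator;
import java.util.stream.IntStream;

// Hypothetical illustration class, not part of the question's code base.
final class RandomContentionSketch {

    // One Random shared by all threads: every nextInt() updates the same seed with a CAS.
    private static final Random SHARED = new Random();

    // Both sources return a value in [min, max).
    static final IntBinaryOperator SHARED_SOURCE =
            (min, max) -> min + SHARED.nextInt(max - min);
    static final IntBinaryOperator THREAD_LOCAL_SOURCE =
            (min, max) -> ThreadLocalRandom.current().nextInt(min, max);

    // Draws `draws` numbers on the common fork-join pool and returns the elapsed nanoseconds.
    static long timeParallelDraws(IntBinaryOperator source, int draws) {
        long start = System.nanoTime();
        long sum = IntStream.range(0, draws)
                .parallel()
                .mapToLong(i -> source.applyAsInt(0, 100))
                .sum();
        long elapsed = System.nanoTime() - start;
        if (sum < 0) {
            // Cannot happen for values in [0, 100); keeps the sum from being optimised away.
            throw new IllegalStateException("unexpected negative sum");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        int draws = 5_000_000;
        // Warm up both paths so the JIT has compiled them before the timed runs.
        timeParallelDraws(SHARED_SOURCE, draws);
        timeParallelDraws(THREAD_LOCAL_SOURCE, draws);
        System.out.printf("shared Random:     %d ms%n",
                timeParallelDraws(SHARED_SOURCE, draws) / 1_000_000);
        System.out.printf("ThreadLocalRandom: %d ms%n",
                timeParallelDraws(THREAD_LOCAL_SOURCE, draws) / 1_000_000);
    }
}
```

On a multi-core machine the shared source is usually the slower of the two, but this is a crude timing loop rather than a proper benchmark, so treat the printed numbers as indicative only.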
The simplest fix here is to use ThreadLocalRandom instead:
new DataFactoryImpl( (a, b) -> ThreadLocalRandom.current().nextInt(a, b+1), dict() );
Note that ThreadLocalRandom.nextInt() does not have the fast path that RandomUtils.nextInt() has, so if you want to keep it, use:
new DataFactoryImpl(
(a, b) -> a == b ? a : ThreadLocalRandom.current().nextInt(a, b+1), dict() );
Be careful not to cache the ThreadLocalRandom.current() instance somewhere outside (for example in a field or a static variable): this call must be made in the same thread that actually requests the random number.
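The caching rule can be sketched as follows. The class ThreadLocalRandomUsage and the constant INCLUSIVE_RANGE are hypothetical names for illustration; the lambda itself is the fast-path variant shown above, and the commented-out field shows the pattern to avoid.

```java
import java.util.Arrays;
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.IntBinaryOperator;
import java.util.stream.IntStream;

// Hypothetical illustration class, not part of the question's code base.
final class ThreadLocalRandomUsage {

    // Pattern to avoid: this would obtain the generator on whichever thread
    // initialises the class and then hand it to every other thread.
    // static final ThreadLocalRandom CACHED = ThreadLocalRandom.current();

    // current() sits inside the lambda body, so it is evaluated on the calling
    // thread each time a number is requested. Returns a value in [a, b].
    // Assumes b < Integer.MAX_VALUE, otherwise b + 1 overflows.
    static final IntBinaryOperator INCLUSIVE_RANGE =
            (a, b) -> a == b ? a : ThreadLocalRandom.current().nextInt(a, b + 1);

    public static void main(String[] args) {
        // Each worker thread of the parallel stream uses its own generator state.
        int[] sample = IntStream.range(0, 10)
                .parallel()
                .map(i -> INCLUSIVE_RANGE.applyAsInt(1, 6))
                .toArray();
        System.out.println(Arrays.toString(sample));
    }
}
```

Passing the lambda (rather than a generator instance) into DataFactoryImpl keeps the current() call on the thread that draws the number, whichever thread that turns out to be.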