Why are Scala "for loop comprehensions" so slow compared to FOR loops?

Question

Mar*_*tos 8 performance for-loop scala

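It has been said that Scala for comprehensions are actually quite slow. The reason I was given is that, due to a Java limitation, for comprehensions (such as "reduce", used below) need to generate a temporary object on every iteration in order to call the function passed in.

Is this true? The tests below seem to bear this out, but I do not completely understand why this is the case.

This might make sense for "lambdas" or anonymous functions, but not for non-anonymous functions.

In my test, I ran for loops against list.reduce (see the code below) and found the for loops to be more than twice as fast, even when each iteration called the exact same function that was passed to reduce!

I find this amazingly counter-intuitive (one would think that the Scala library would have been carefully written to be as optimal as possible).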

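In a test I put together, I ran the same loop (summing the numbers 1 to one million, ignoring overflow) five different ways:

1. for loop over the collection of values
2. for loop, but calling a function instead of inline arithmetic
3. for loop, creating an object that contains the addition function
4. list.reduce, passing in an anonymous function
5. list.reduce, passing in an object member function

The results were as follows, in the form test: min/max/average (milliseconds)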

1. 27/157/64.78
2. 27/192/65.77 <--- note the similarity between tests 1,2 and 4,5
3. 139/313/202.58
4. 63/342/150.18
5. 63/341/149.99


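As can be seen, the "for comprehension" versions (tests 4 and 5) are on the same order as the "for loop with a new object per iteration" version (test 3), implying that a "new" may in fact be performed for both the anonymous and the non-anonymous function versions.

Methodology: the code below (with the test calls removed) was compiled into a single .jar file to ensure that all versions ran the same library code. Each test in each iteration was invoked in a new JVM (i.e. scala -cp ... for each test) in order to rule out heap size issues.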

class t(val i: Int) {
    def summit(j: Int) = j + i
}

object bar {
    val biglist:List[Int]  =  (1 to 1000000).toList

    def summit(i: Int, j:Int) = i+j

    // Simple for loop
    def forloop:  Int = {
        var result: Int = 0
        for(i <- biglist) {
            result += i
        }
        result
    }

    // For loop with a function instead of inline math
    def forloop2:  Int = {
        var result: Int = 0
        for(i <- biglist) {
            result = summit(result,i)
        }
        result
    }

    // for loop with a generated object PER iteration
    def forloop3: Int = {
        var result: Int = 0
        for(i <- biglist) {
            val t = new t(result)
            result = t.summit(i)
        }
        result
    }

    // list.reduce with an anonymous function passed in
    def anonymousfunc: Int = {
        biglist.reduce((i,j) => {i+j})
    }

    // list.reduce with a named function
    def realfunc: Int = {
        biglist.reduce(summit)
    }

    // Test-calling code excised for brevity. One example given:
    def main(args: Array[String]): Unit = {
        args(0) match {
            case "1" => {
                val start = System.currentTimeMillis()
                forloop
                val end = System.currentTimeMillis()
                println("for=" + (end - start))
            }
            // ... (remaining cases elided in the original)
        }
    }
}
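
The min/max/average figures above could be gathered with a small helper along the following lines. This is only a sketch and not the excised test code: `TimingSketch`, `timed`, and `stats` are hypothetical names, and unlike the one-JVM-per-test method described above it repeats the measurement inside a single JVM, so later runs may benefit from JIT warm-up.

```scala
object TimingSketch {
  // Hypothetical helper: evaluates `body` once and returns its result
  // together with the elapsed wall-clock time in milliseconds.
  def timed[A](body: => A): (A, Long) = {
    val start = System.nanoTime()
    val result = body
    val elapsedMs = (System.nanoTime() - start) / 1000000L
    (result, elapsedMs)
  }

  // Hypothetical helper: repeats the measurement `runs` times and
  // returns (min, max, average) in milliseconds.
  def stats(runs: Int)(body: => Any): (Long, Long, Double) = {
    val times = List.fill(runs)(timed(body)._2)
    (times.min, times.max, times.sum.toDouble / runs)
  }
}
```

For example, `TimingSketch.stats(100)(bar.forloop)` would report the three figures for the first test, assuming the `bar` object from the question is on the classpath.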


Answer 1

Nik*_*kov 16

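What you were told is true of "for comprehensions", but the problem with your question is that you have mixed up "for comprehensions" with "anonymous functions".

A "for comprehension" in Scala is syntactic sugar for a series of .flatMap, .map and .filter applications. Since you are testing reduction algorithms, and since it is impossible to implement a reduction algorithm using those three functions, your test cases are incorrect.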

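To make the distinction concrete, here is a minimal sketch (the object and method names are hypothetical, not taken from the question). A `for` loop without `yield`, like the ones in the question, is itself rewritten by the compiler into a `foreach` call that takes a function value, whereas `reduce` is an ordinary method call that no comprehension produces.

```scala
object DesugarSketch {
  val numbers: List[Int] = (1 to 10).toList

  // A `for` loop without `yield`, as used in the question's forloop tests.
  def sumWithFor: Int = {
    var result = 0
    for (i <- numbers) result += i
    result
  }

  // Roughly what the compiler turns the loop above into:
  // a `foreach` call that receives an anonymous function.
  def sumWithForeach: Int = {
    var result = 0
    numbers.foreach(i => result += i)
    result
  }

  // `reduce` is called directly; it is not the output of a comprehension.
  def sumWithReduce: Int = numbers.reduce(_ + _)
}
```

All three methods compute the same sum; they differ only in which library method ends up invoking the function value.
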
Here is an example of a "for comprehension":

val listOfLists = List(List(1,2), List(3,4), List(5))
val result = 
  for {
    itemOfListOfLists <- listOfLists
    itemOfItemOfListOfLists <- itemOfListOfLists
  }
  yield (itemOfItemOfListOfLists + 1)
assert( result == List(2,3,4,5,6) )


The compiler desugars the comprehension part into the following:

val result =
  listOfLists.flatMap(
    itemOfListOfLists => itemOfListOfLists.map(
      itemOfItemOfListOfLists => itemOfItemOfListOfLists + 1
    )
  )


Then it desugars the anonymous function syntax:

val result =
  listOfLists.flatMap(
    new Function1[List[Int], List[Int]] {
      override def apply(itemOfListOfLists: List[Int]): List[Int] =
        itemOfListOfLists.map(
          new Function1[Int, Int] {
            override def apply(itemOfItemOfListOfLists: Int): Int =
              itemOfItemOfListOfLists + 1
          }
        )
    }
  )


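From the desugared code it is now evident that the Function1[Int, Int] class gets instantiated every time the apply(itemOfListOfLists: List[Int]): List[Int] method is called, and that happens for every entry of listOfLists. So the more complex your comprehension is, the more instantiations of Function objects you get.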
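
The instantiation count can be made visible with a small sketch of the explicit form. `AllocationCount`, `AddOne`, and `innerCreated` are hypothetical names introduced for illustration, and the counter only reflects this hand-written version; what a given compiler and JVM actually allocate for the lambda syntax may differ.

```scala
object AllocationCount {
  // Hypothetical counter, incremented whenever an inner function object is built.
  var innerCreated = 0

  // Stands in for the anonymous Function1[Int, Int] of the desugared code.
  class AddOne extends Function1[Int, Int] {
    innerCreated += 1
    override def apply(x: Int): Int = x + 1
  }

  val listOfLists: List[List[Int]] = List(List(1, 2), List(3, 4), List(5))

  // Runs the explicit flatMap/map form and returns the result together
  // with the number of AddOne instances created along the way.
  def run(): (List[Int], Int) = {
    innerCreated = 0
    val result = listOfLists.flatMap(
      new Function1[List[Int], List[Int]] {
        override def apply(inner: List[Int]): List[Int] =
          inner.map(new AddOne) // a fresh AddOne for each outer element
      }
    )
    (result, innerCreated)
  }
}
```

With the three-element `listOfLists` above, the outer `apply` runs three times, so three `AddOne` objects are constructed.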

Archived: 12 years, 6 months ago
Views: 2531
Last activity: 9 years, 1 month ago