How to control the concurrency of Future.sequence in Scala?

k0p*_*kus 12 concurrency asynchronous scala future

I know I can convert a Seq[Future[T]] into a Future[Seq[T]] via

  val seqFuture = Future.sequence(seqOfFutures)
  seqFuture.map((seqT: Seq[T]) => {...})

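My problem now is that I have 700 futures in this sequence, and I would like to control how many of them are resolved in parallel. Each future calls an internal REST API, and firing 700 requests at once is like launching a DoS attack against that server.

I would rather have only about 10 futures being resolved at a time.

How can I do that?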


Trying pamu's answer, I see these errors:

[error] /home/philipp/src/bluebat/src/main/scala/com/dreamlines/metronome/service/JobFetcher.scala:32:44: com.dreamlines.commons.LazyFuture[A] does not take parameters
[error]         val batch = Future.sequence(c.map(_()))
[error]                                            ^
[error] /home/philipp/src/bluebat/src/main/scala/com/dreamlines/metronome/service/JobFetcher.scala:32:28: no type parameters for method sequence: (in: M[scala.concurrent.Future[A]])(implicit cbf: scala.collection.generic.CanBuildFrom[M[scala.concurrent.Future[A]],A,M[A]], implicit executor: scala.concurrent.ExecutionContext)scala.concurrent.Future[M[A]] exist so that it can be applied to arguments (List[Nothing])
[error]  --- because ---
[error] argument expression's type is not compatible with formal parameter type;
[error]  found   : List[Nothing]
[error]  required: ?M[scala.concurrent.Future[?A]]
[error]         val batch = Future.sequence(c.map(_()))
[error]                            ^
[error] /home/philipp/src/bluebat/src/main/scala/com/dreamlines/metronome/service/JobFetcher.scala:32:42: type mismatch;
[error]  found   : List[Nothing]
[error]  required: M[scala.concurrent.Future[A]]
[error]         val batch = Future.sequence(c.map(_()))
[error]                                          ^
[error] /home/philipp/src/bluebat/src/main/scala/com/dreamlines/metronome/service/JobFetcher.scala:32:36: Cannot construct a collection of type M[A] with elements of type A based on a collection of type M[scala.concurrent.Future[A]].
[error]         val batch = Future.sequence(c.map(_()))
[error]                                    ^
[error] four errors found

pam*_*amu 7

foldLeft

A simple foldLeft can be used to control how many futures run concurrently at a time.

First, let's create a case class called LazyFuture

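case class LazyFuture[+A](f: Unit => Future[A]) {
  def apply(): Future[A] = f(())
}

object LazyFuture {
  // Wraps a plain computation; the Future is only created when apply() is called
  def apply[A](f: => A)(implicit ec: ExecutionContext): LazyFuture[A] = LazyFuture(_ => Future(f))

  // No implicit parameter here: with one, both overloads would have the same signature after erasure
  def apply[A](f: => Future[A]): LazyFuture[A] = LazyFuture(_ => f)
}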

LazyFuture stops the future from running immediately

val list: List[LazyFuture[A]] = ...


list.grouped(concurFactor).foldLeft(Future.successful(List.empty[A])){ (r, c) =>
  // Create the batch inside flatMap so it starts only after the previous batches have completed
  r.flatMap(rs => Future.sequence(c.map(_())).map(values => rs ++ values))
}

Change concurFactor to set how many futures run concurrently.

A concurFactor of 1 runs one future at a time

A concurFactor of 2 runs two futures at a time

and so on ...

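def executeBatch[A](list: List[LazyFuture[A]])(concurFactor: Int)(implicit ec: ExecutionContext): Future[List[A]] =
   list.grouped(concurFactor).foldLeft(Future.successful(List.empty[A])){ (r, c) =>
      // Create the batch inside flatMap so it starts only after the previous batches have completed
      r.flatMap(rs => Future.sequence(c.map(_())).map(values => rs ++ values))
    }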

Complete code

  case class LazyFuture[+A](f: Unit => Future[A]) {
    def apply(): Future[A] = f(())
  }

  object LazyFuture {
    // Wraps a plain computation; the Future is only created when apply() is called
    def apply[A](f: => A)(implicit ec: ExecutionContext): LazyFuture[A] = LazyFuture(_ => Future(f))

    // No implicit parameter here: with one, both overloads would have the same signature after erasure
    def apply[A](f: => Future[A]): LazyFuture[A] = LazyFuture(_ => f)
  }

  def executeBatch[A](list: List[LazyFuture[A]])(concurFactor: Int)(implicit ec: ExecutionContext): Future[List[A]] =
    list.grouped(concurFactor).foldLeft(Future.successful(List.empty[A])) { (r, c) =>
      // Create the batch inside flatMap so it starts only after the previous batches have completed
      r.flatMap(rs => Future.sequence(c.map(_ ())).map(values => rs ++ values))
    }

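Limiting the execution context

You can also limit computing resources by restricting the number of threads in the execution pool. However, this solution is not as flexible. Personally, I don't like it.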

import java.util.concurrent.Executors
import scala.concurrent.ExecutionContext

val context: ExecutionContext =
  ExecutionContext.fromExecutor(Executors.newFixedThreadPool(8))

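You have to remember to pass the correct execution context, which is an implicit value. Sometimes we don't know which implicit is in scope, so this approach is error-prone.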
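To make the thread-pool approach concrete, here is a minimal sketch, assuming Scala 2.12 or later. The object name `BoundedPool` and the method `peakConcurrency` are made up for this illustration. It declares the bounded context as a local implicit so it is obvious which one is in scope, and measures how many tasks actually overlap. Note that a fixed pool only caps concurrency when each future blocks its thread for the duration of the call; with a non-blocking HTTP client the thread is released right away and the limit would not hold.

```scala
import java.util.concurrent.Executors
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

// Hypothetical helper for illustration only; not part of the answer above.
object BoundedPool {
  // Runs `count` blocking tasks on a dedicated pool of `threads` threads
  // and returns the highest number of tasks observed running at once.
  def peakConcurrency(count: Int, threads: Int): Int = {
    val pool = Executors.newFixedThreadPool(threads)
    // Local implicit: only futures created in this method use the bounded pool.
    implicit val boundedEc: ExecutionContext = ExecutionContext.fromExecutor(pool)
    val running = new AtomicInteger(0)
    val peak = new AtomicInteger(0)
    try {
      val futures = (1 to count).toList.map { _ =>
        Future {
          val now = running.incrementAndGet()
          peak.synchronized { if (now > peak.get) peak.set(now) }
          Thread.sleep(20) // stands in for a blocking REST call
          running.decrementAndGet()
        }
      }
      Await.result(Future.sequence(futures), 30.seconds)
      peak.get
    } finally {
      pool.shutdown()
    }
  }

  def main(args: Array[String]): Unit =
    println(s"peak concurrency: ${peakConcurrency(count = 20, threads = 4)}")
}
```

The peak can never exceed the pool size, since each task occupies one pool thread while it sleeps.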

Gotchas

When a future is constructed as follows

val foo = Future {
     1 + 2
} // future starts executing

LazyFuture(foo) // Not a right way

foo has already started executing and can no longer be controlled.

The right way to construct a LazyFuture

val foo = LazyFuture {
  1 + 2
}

or

val foo = LazyFuture {
  Future {
   1 + 2
  }
}
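The difference between an eager Future and a deferred one can be observed directly. The sketch below is only an illustration: it uses a plain `() => Future[Int]` thunk in place of LazyFuture, and the names `EagerVsDeferred` and `demo` are made up for it. A flag records whether each body has run.

```scala
import java.util.concurrent.atomic.AtomicBoolean
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global

// Hypothetical demo object; a plain thunk stands in for LazyFuture here.
object EagerVsDeferred {
  // Returns (eager body ran without being asked,
  //          deferred body ran before invocation,
  //          deferred body ran after invocation)
  def demo(): (Boolean, Boolean, Boolean) = {
    val eagerRan = new AtomicBoolean(false)
    // The body is scheduled as soon as the Future is constructed.
    val eager: Future[Int] = Future { eagerRan.set(true); 1 + 2 }
    Await.result(eager, 5.seconds)

    val deferredRan = new AtomicBoolean(false)
    // Wrapping in a function delays construction, so nothing is scheduled yet.
    val deferred: () => Future[Int] = () => Future { deferredRan.set(true); 1 + 2 }
    val beforeCall = deferredRan.get

    // Only calling the thunk creates and starts the Future.
    Await.result(deferred(), 5.seconds)
    (eagerRan.get, beforeCall, deferredRan.get)
  }

  def main(args: Array[String]): Unit = println(demo())
}
```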

Working example

package main

import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration.Duration

object Main {

  case class LazyFuture[A](f: Unit => Future[A]) {
    def apply(): Future[A] = f()
  }

  object LazyFuture {
    def apply[A](f: => A)(implicit ec: ExecutionContext): LazyFuture[A] = LazyFuture(_ => Future(f))
    def apply[A](f: => Future[A]): LazyFuture[A] = LazyFuture(_ => f)
  }

  def executeBatch[A](list: List[LazyFuture[A]])(concurFactor: Int)
    (implicit ec: ExecutionContext): Future[List[A]] =
    list.grouped(concurFactor).foldLeft(Future.successful(List.empty[A])) { (r, c) =>
      // Create the batch inside flatMap so it starts only after the previous batches have completed
      r.flatMap(rs => Future.sequence(c.map(_ ())).map(values => rs ++ values))
    }

  def main(args: Array[String]): Unit = {
    import scala.concurrent.ExecutionContext.Implicits.global


    val futures: Seq[LazyFuture[Int]] = List(1, 2, 3, 4, 5).map { value =>
      LazyFuture {
        println(s"value: $value started")
        Thread.sleep(value * 200)
        println(s"value: $value stopped")
        value
      }
    }
    val f = executeBatch(futures.toList)(2)
    Await.result(f, Duration.Inf)
  }

}

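  • @pamu, your solution does not keep exactly "n" futures running at any given time. When it starts processing a group, it fully utilizes the pool. But once a few tasks have finished, it does not pick up tasks from the next group. The last task in a group ends up running alone. (2 upvotes)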
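One possible way to address this comment is to replace fixed batches with a small set of "workers" that each pull the next task from a shared queue as soon as their current one finishes. This is a minimal sketch rather than anything from the answer above, assuming Scala 2.12 or later. `BoundedRunner` and `runWithLimit` are hypothetical names, and tasks are modelled as plain `() => Future[A]` thunks instead of LazyFuture. If any task fails, the returned future fails, while the remaining workers keep draining the queue.

```scala
import java.util.concurrent.ConcurrentLinkedQueue
import scala.collection.concurrent.TrieMap
import scala.concurrent.{ExecutionContext, Future}

// Hypothetical helper for illustration only; not part of the original answer.
object BoundedRunner {
  // Runs the tasks with at most `limit` in flight and returns results in input order.
  def runWithLimit[A](tasks: List[() => Future[A]], limit: Int)
                     (implicit ec: ExecutionContext): Future[List[A]] = {
    // Shared queue of (task, position) pairs that the workers drain.
    val queue = new ConcurrentLinkedQueue[(() => Future[A], Int)]()
    tasks.zipWithIndex.foreach(t => queue.add(t))
    // Results keyed by position, so completion order does not matter.
    val results = TrieMap.empty[Int, A]

    // A worker runs one task, then immediately takes the next one from the queue.
    def worker(): Future[Unit] =
      Option(queue.poll()) match {
        case None => Future.unit
        case Some((task, index)) =>
          task().flatMap { value =>
            results.put(index, value)
            worker()
          }
      }

    // Each worker has at most one task in flight, so `limit` workers cap concurrency.
    val workers = List.fill(math.max(1, limit))(worker())
    Future.sequence(workers).map(_ => tasks.indices.map(i => results(i)).toList)
  }
}
```

With 700 tasks and a limit of 10, a call such as `BoundedRunner.runWithLimit(tasks, 10)` would keep up to ten requests in flight until the queue is empty, instead of waiting for the slowest task of each batch.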