Posts by gon*_*iaz

Spark Streaming Kinesis on EMR throws "Error while storing block into Spark"

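We have a Spark (2.3.1) Streaming application running on EMR (5.16) that has been consuming a single shard of an AWS Kinesis stream for the past 2 years. Recently (over the last 2 months), the application has started failing with random errors. We tried updating to the latest versions of Spark and EMR, but the error was not resolved.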

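The problem
Out of the blue, at any time of day and after any amount of uptime, and with nothing unusual showing in the RAM, CPU, or network metrics, the receiver stops sending messages to Spark and we get this error in the logs: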

18/09/11 18:20:00 ERROR ReceiverTracker: Deregistered receiver for stream 0: Error while storing block into Spark - java.util.concurrent.TimeoutException: Futures timed out after [30 seconds]
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:201)
at org.apache.spark.streaming.receiver.WriteAheadLogBasedBlockHandler.storeBlock(ReceivedBlockHandler.scala:210)
at org.apache.spark.streaming.receiver.ReceiverSupervisorImpl.pushAndReportBlock(ReceiverSupervisorImpl.scala:158)
at org.apache.spark.streaming.receiver.ReceiverSupervisorImpl.pushArrayBuffer(ReceiverSupervisorImpl.scala:129)
at org.apache.spark.streaming.receiver.Receiver.store(Receiver.scala:133)
at org.apache.spark.streaming.kinesis.KinesisReceiver.org$apache$spark$streaming$kinesis$KinesisReceiver$$storeBlockWithRanges(KinesisReceiver.scala:306)
at org.apache.spark.streaming.kinesis.KinesisReceiver$GeneratedBlockHandler.onPushBlock(KinesisReceiver.scala:357)
at org.apache.spark.streaming.receiver.BlockGenerator.pushBlock(BlockGenerator.scala:297)
at org.apache.spark.streaming.receiver.BlockGenerator.org$apache$spark$streaming$receiver$BlockGenerator$$keepPushingBlocks(BlockGenerator.scala:269)
at org.apache.spark.streaming.receiver.BlockGenerator$$anon$1.run(BlockGenerator.scala:110)
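The "[30 seconds]" in the trace appears to match the default of spark.streaming.receiver.blockStoreTimeout, an undocumented setting that WriteAheadLogBasedBlockHandler reads in the Spark 2.x source. Treat that as an assumption and check it against your Spark version. The following is a minimal sketch of how the setting could be expressed as an EMR spark-defaults classification. The helper name and the value 60 are hypothetical, and a longer timeout may only hide slow write-ahead-log or block-manager writes rather than fix them.

```python
import json


def spark_defaults_with_block_store_timeout(seconds):
    """Build an EMR configuration list that sets the block store timeout.

    Hypothetical helper for illustration only. The property name is taken
    from the Spark 2.x source (assumption: unchanged in your version).
    """
    return [
        {
            "Classification": "spark-defaults",
            "Properties": {
                # Plain integer seconds, passed as a string like other
                # EMR configuration property values.
                "spark.streaming.receiver.blockStoreTimeout": str(seconds),
            },
        }
    ]


# 60 is an arbitrary example value, not a recommendation.
config = spark_defaults_with_block_store_timeout(60)
print(json.dumps(config, indent=2))
```

The printed JSON has the same list-of-classifications shape that EMR accepts when a cluster is created.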

The Spark application keeps running normally, but it has no records to process.
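Because the application keeps running after the receiver is gone, the failure is easy to miss. One way to catch it is to scan the driver log for the deregistration message. This is a minimal sketch: the pattern is built only from the log line quoted above, the function name is hypothetical, and the message wording in other Spark versions is an assumption.

```python
import re

# Pattern derived from the single log line quoted in the question.
DEREGISTERED = re.compile(
    r"ERROR ReceiverTracker: Deregistered receiver for stream (\d+): (.*)"
)


def find_deregistered_receivers(lines):
    """Return a (stream_id, reason) tuple for each deregistration line."""
    hits = []
    for line in lines:
        match = DEREGISTERED.search(line)
        if match:
            hits.append((int(match.group(1)), match.group(2).strip()))
    return hits


# Both sample lines are copied from the log excerpt in the question.
sample_log = [
    "18/09/11 18:20:00 ERROR ReceiverTracker: Deregistered receiver for stream 0: "
    "Error while storing block into Spark - java.util.concurrent.TimeoutException: "
    "Futures timed out after [30 seconds]",
    "at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)",
]

hits = find_deregistered_receivers(sample_log)
for stream_id, reason in hits:
    print(f"receiver for stream {stream_id} deregistered: {reason}")
```

A check like this could feed an alert or a restart, which is a mitigation for the silent stall and not a fix for the underlying timeout.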

The only related thing I have found on the internet is this GitHub issue: https://github.com/awslabs/amazon-kinesis-client/issues/185

What I've tried
The final version of the configuration looks like this:

[
  {
    "Classification": "capacity-scheduler",
    "Properties": …

amazon-emr apache-spark amazon-kinesis

5 votes, 0 answers, 521 views
