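Does rdd1.join(rdd2) cause a shuffle if rdd1 and rdd2 have the same partitioner?
When I run a SQL query through spark-submit and spark-sql, the corresponding Spark application always fails with the following error: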
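To make the question concrete, here is a minimal plain-Java sketch (it does not use the Spark API) of why identical partitioning matters for a join. It assumes a simple hash partitioner; `partitionOf`, `partition`, and `joinCoPartitioned` are hypothetical names introduced only for this illustration. When both sides are split with the same function and the same partition count, a given key lands at the same partition index on both sides, so partition i only has to be matched against partition i.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration only; no Spark classes are involved.
class CoPartitionSketch {
    // Hypothetical stand-in for a hash partitioner:
    // non-negative hash code modulo the number of partitions.
    static int partitionOf(Object key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    // Split key/value pairs into numPartitions buckets using partitionOf.
    static List<Map<String, Integer>> partition(Map<String, Integer> data, int numPartitions) {
        List<Map<String, Integer>> parts = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            parts.add(new HashMap<>());
        }
        for (Map.Entry<String, Integer> e : data.entrySet()) {
            parts.get(partitionOf(e.getKey(), numPartitions)).put(e.getKey(), e.getValue());
        }
        return parts;
    }

    // Inner join that only compares partition i of the left side with
    // partition i of the right side; both lists must have the same size.
    static Map<String, int[]> joinCoPartitioned(List<Map<String, Integer>> left,
                                                List<Map<String, Integer>> right) {
        Map<String, int[]> out = new HashMap<>();
        for (int i = 0; i < left.size(); i++) {
            for (Map.Entry<String, Integer> e : left.get(i).entrySet()) {
                Integer r = right.get(i).get(e.getKey());
                if (r != null) {
                    out.put(e.getKey(), new int[] {e.getValue(), r});
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Integer> a = Map.of("x", 1, "y", 2, "z", 3);
        Map<String, Integer> b = Map.of("x", 10, "z", 30, "w", 40);
        Map<String, int[]> joined = joinCoPartitioned(partition(a, 4), partition(b, 4));
        System.out.println(joined.keySet());
    }
}
```

The sketch only shows the partition-index argument; whether a real cluster also has to move data between machines depends on where the partitions physically live, which this example does not model.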
15/03/10 18:50:52 INFO util.AkkaUtils: Connecting to HeartbeatReceiver: akka.tcp://sparkDriver@slave75:60697/user/HeartbeatReceiver
15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave79:35643] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
The above is only one of the errors. I used "yarn logs -application application_1425944520319_8102.log" to fetch the full application log and filtered out the following errors:
Line 46: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave09:55156] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 97: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave09:32852] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 149: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave09:45654] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting down.
Line 200: 15/03/10 18:52:08 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave10:45702] -> [akka.tcp://sparkDriver@slave75:60697] disassociated! Shutting …

When I run a Spark Streaming application on YARN, I keep getting the following error
Why does the error occur, and how can it be fixed? Any suggestions would help, thanks.
15/05/07 11:11:50 INFO dstream.StateDStream: Marking RDD 2364 for time 1430968310000 ms for checkpointing
15/05/07 11:11:50 INFO scheduler.JobScheduler: Added jobs for time 1430968310000 ms
15/05/07 11:11:50 INFO scheduler.JobGenerator: Checkpointing graph for time 1430968310000 ms
15/05/07 11:11:50 INFO streaming.DStreamGraph: Updating checkpoint data for time 1430968310000 ms
15/05/07 11:11:50 INFO streaming.DStreamGraph: Updated checkpoint data for time 1430968310000 ms
15/05/07 11:11:50 ERROR actor.OneForOneStrategy: org.apache.spark.streaming.StreamingContext
java.io.NotSerializableException: org.apache.spark.streaming.StreamingContext
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1184)
at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432) …
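
The trace above is the usual shape of Java serialization reaching a non-serializable object through a field of a serializable one (writeObject0, then defaultWriteFields, then writeObject0 again). The following plain-Java sketch reproduces that pattern without Spark. `FakeContext`, `BadState`, `GoodState`, and `canSerialize` are hypothetical names used only for illustration; it is an assumption, not something the log confirms, that the StreamingContext in the question was captured in a comparable way.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class SerializationSketch {
    // Hypothetical stand-in for a non-serializable handle such as a context object.
    static class FakeContext {}

    // Serializable class holding a non-serializable field: writing it fails.
    static class BadState implements Serializable {
        final FakeContext ctx = new FakeContext();
    }

    // Same class with the field marked transient: the field is skipped on write.
    static class GoodState implements Serializable {
        final transient FakeContext ctx = new FakeContext();
    }

    // Returns true if Java serialization of obj succeeds,
    // false if it fails with NotSerializableException.
    static boolean canSerialize(Object obj) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(obj);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            // Any other I/O failure is unexpected for an in-memory stream.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(canSerialize(new BadState()));
        System.out.println(canSerialize(new GoodState()));
    }
}
```

In this sketch the first call reports failure and the second succeeds. Marking the field transient is only one possible way out; not referencing the context from the serialized object at all is another.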