nic*_*ola 9 hadoop-yarn spark-streaming
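I am running a Spark Streaming application on YARN in cluster mode, and I am trying to implement a graceful shutdown so that, when the application is killed, it finishes executing the current micro-batch before stopping.
Following some tutorials, I set spark.streaming.stopGracefullyOnShutdown to true and added the following code to my application: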
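sys.ShutdownHookThread {
  log.info("Gracefully stopping Spark Streaming Application")
  // Stop the SparkContext too, and wait for in-flight batches to finish.
  ssc.stop(stopSparkContext = true, stopGracefully = true)
  log.info("Application stopped")
}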
However, when I kill the application with
yarn application -kill application_1454432703118_3558
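the micro-batch that is executing at that moment does not complete.
In the driver log I see the first log line ("Gracefully stopping Spark Streaming Application") but not the last one ("Application stopped").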
ERROR yarn.ApplicationMaster: RECEIVED SIGNAL 15: SIGTERM
INFO streaming.MySparkJob: Gracefully stopping Spark Streaming Application
INFO scheduler.JobGenerator: Stopping JobGenerator gracefully
INFO scheduler.JobGenerator: Waiting for all received blocks to be consumed for job generation
INFO scheduler.JobGenerator: Waited for all received blocks to be consumed for job generation
INFO streaming.StreamingContext: Invoking stop(stopGracefully=true) from shutdown hook
In the executor logs I see the following errors:
ERROR executor.CoarseGrainedExecutorBackend: Driver 192.168.6.21:49767 disassociated! Shutting down.
INFO storage.DiskBlockManager: Shutdown hook called
WARN remote.ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkDriver@192.168.6.21:49767] has failed, address is now gated for [5000] ms. Reason: [Disassociated]
INFO util.ShutdownHookManager: Shutdown hook called
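I think the problem is related to how YARN sends the kill signal to the application. Any ideas on how to make the application stop gracefully?
Go to the executors page to see where your driver is running (on which node). SSH to that node and run: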
ps -ef | grep 'app_name'
(Replace app_name with your class name or application name.) This lists several processes. Look through them: some will be children of others. Pick the ID of the top-most parent process and send it a SIGTERM:
kill pid
After a while, you will see that your application has terminated gracefully.
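Picking the top-most parent by eye from the ps listing can be error-prone, so here is a minimal sketch of that selection step. It assumes the process list is supplied as "PID PPID COMMAND" lines (for example from `ps -eo pid=,ppid=,args=` on Linux). The function name `top_parent_pids` is hypothetical and not part of any tool.

```shell
# Sketch: pick the top-most matching process from a ps snapshot.
# Input (stdin): lines of "PID PPID COMMAND...".
# Output: PIDs whose command contains the given name and whose parent
# is not itself one of the matching processes.
top_parent_pids() {
  awk -v name="$1" '
    {
      cmd = $0
      # Strip the leading PID and PPID columns so only the command is searched.
      sub(/^[ \t]*[0-9]+[ \t]+[0-9]+[ \t]+/, "", cmd)
      if (index(cmd, name) > 0) parent_of[$1] = $2
    }
    END {
      for (pid in parent_of)
        if (!(parent_of[pid] in parent_of)) print pid
    }
  '
}
```

One way to use it: capture the snapshot first with `snapshot=$(ps -eo pid=,ppid=,args=)`, then run `printf '%s\n' "$snapshot" | top_parent_pids MySparkJob`, and pass the printed PID to `kill`, which sends SIGTERM by default. Taking the snapshot before filtering keeps the filter process itself out of the listing.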
Also, you no longer need to add those shutdown hooks yourself. Use the spark.streaming.stopGracefullyOnShutdown configuration to get a graceful shutdown.
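As a rough illustration of setting that property at submit time rather than in code, the sketch below prints a spark-submit command line for review. The class name and jar path are hypothetical placeholders, as is the helper name `print_submit_command`. Only the `--master`, `--deploy-mode`, `--conf` and `--class` options and the property name are taken as given.

```shell
# Hypothetical class name and jar path; replace with your own.
APP_CLASS="com.example.streaming.MySparkJob"
APP_JAR="my-spark-job.jar"

# Prints the submit command for review instead of running it;
# drop the leading `echo` to actually submit the job.
print_submit_command() {
  echo spark-submit \
    --master yarn \
    --deploy-mode cluster \
    --conf spark.streaming.stopGracefullyOnShutdown=true \
    --class "$APP_CLASS" \
    "$APP_JAR"
}

print_submit_command
```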