Cannot submit Spark application to cluster, stuck on "UNDEFINED"

Vu *_*Anh 6 apache-spark

I submit the Spark application to the YARN cluster with this command:

export YARN_CONF_DIR=conf
bin/spark-submit --class "Mining" \
  --master yarn-cluster \
  --executor-memory 512m ./target/scala-2.10/mining-assembly-0.1.jar

In the Web UI, it stays stuck at UNDEFINED.

[screenshot: YARN Web UI showing the application stuck in the UNDEFINED state]

In the console, it is stuck at:

14/11/12 16:37:55 INFO yarn.Client: Application report from ASM: 
     application identifier: application_1415704754709_0017
     appId: 17
     clientToAMToken: null
     appDiagnostics: 
     appMasterHost: example.com
     appQueue: default
     appMasterRpcPort: 0
     appStartTime: 1415784586000
     yarnAppState: RUNNING
     distributedFinalState: UNDEFINED
     appTrackingUrl: http://example.com:8088/proxy/application_1415704754709_0017/
     appUser: rain

Update:

Digging into the container logs in the Web UI at http://example.com:8042/node/containerlogs/container_1415704754709_0017_01_000001/rain/stderr/?start=0, I found this:

14/11/12 02:11:47 WARN YarnClusterScheduler: Initial job has not accepted 
any resources; check your cluster UI to ensure that workers are registered
and have sufficient memory
14/11/12 02:11:47 DEBUG Client: IPC Client (1211012646) connection to
spark.mvs.vn/192.168.64.142:8030 from rain sending #24418
14/11/12 02:11:47 DEBUG Client: IPC Client (1211012646) connection to
spark.mvs.vn/192.168.64.142:8030 from rain got value #24418

I found a workaround for this problem at http://hortonworks.com/hadoop-tutorial/using-apache-spark-hdp/:

The Hadoop cluster must have sufficient memory for the request.

For example, submitting the following job with 1GB memory allocated for
executor and Spark driver fails with the above error in the HDP 2.1 Sandbox.
Reduce the memory asked for the executor and the Spark driver to 512m and
re-start the cluster.

I am trying this solution and hoping it will work.

Vu *_*Anh 4

Solution

Finally, I found that it was caused by a memory problem.

When I changed yarn.nodemanager.resource.memory-mb to 3072 (its value was 2048) in the Web UI and restarted the cluster, it worked.
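For reference, the same change can be made directly in yarn-site.xml instead of through a management UI. This is a sketch assuming a standard Hadoop configuration layout; the file location and the restart procedure depend on your distribution:

```xml
<!-- yarn-site.xml: total memory the NodeManager may hand out to containers. -->
<!-- Raising this from 2048 to 3072 gives YARN room for the driver and both executors. -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>3072</value>
</property>
```

After editing the file, the NodeManagers must be restarted for the new limit to take effect.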

[screenshot: configuration page with yarn.nodemanager.resource.memory-mb set to 3072]

I was happy to see this:

[screenshot: YARN Web UI showing the application running successfully]

With 3GB available to the YARN NodeManager, my submit command was:

bin/spark-submit \
    --class "Mining" \
    --master yarn-cluster \
    --executor-memory 512m \
    --driver-memory 512m \
    --num-executors 2 \
    --executor-cores 1 \
    ./target/scala-2.10/mining-assembly-0.1.jar
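As a sanity check, the numbers line up. Assuming Spark 1.x's default YARN memory overhead of 384 MB per container (an assumption; the exact default depends on the Spark version), the submit command above requests more memory than the old 2048 MB limit allowed, but fits within the new 3072 MB:

```shell
#!/bin/sh
# Rough container-size arithmetic for the submit command above.
# OVERHEAD_MB is an assumption: Spark 1.x requests roughly 384 MB
# of extra YARN memory on top of each heap it asks for.
EXECUTOR_MB=512
DRIVER_MB=512
OVERHEAD_MB=384
NUM_EXECUTORS=2

PER_EXECUTOR=$((EXECUTOR_MB + OVERHEAD_MB))      # 896 MB per executor container
DRIVER_CONTAINER=$((DRIVER_MB + OVERHEAD_MB))    # 896 MB for the driver/ApplicationMaster
TOTAL=$((NUM_EXECUTORS * PER_EXECUTOR + DRIVER_CONTAINER))

echo "total requested: ${TOTAL} MB"              # 2688 MB: over 2048, under 3072
```

YARN additionally rounds each container up to a multiple of yarn.scheduler.minimum-allocation-mb (1024 MB by default), which would make each of the three containers 1024 MB, so 3072 MB in total, exactly the new NodeManager limit.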