ERROR YarnClientSchedulerBackend: Asked to remove non-existent executor 21

Question

When I first run

lines = sc.textFile(os.path.join(folder_name),100)

and then

parsed_lines=lines.map(lambda line: parse_line(line, ["udid"])).persist(StorageLevel.MEMORY_AND_DISK).groupByKey(1000).take(10)

I get the following error:

...
ERROR YarnClientSchedulerBackend: Asked to remove non-existent executor 21
...
WARN TaskSetManager: Lost task 0.1 in stage 11.7 (TID 1151, <machine name>): FetchFailed(null, shuffleId=0, mapId=-1, reduceId=896, message=
org.apache.spark.shuffle.MetadataFetchFailedException: Missing an output location for shuffle 0

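I tried changing the following parameters, as well as the number of splits passed to groupByKey and the number of partitions passed to textFile.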

conf.set("spark.cores.max", "128")
conf.set("spark.akka.frameSize", "1024")
conf.set("spark.executor.memory", "6G")
conf.set("spark.shuffle.file.buffer.kb", "100")

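I am not sure how to choose these parameters based on the capacity of the workers, the input size, and the transformations I will apply.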

Answer 1

rye*_*rye 0

I got the same error. I solved it by reducing the number of executors I requested in spark-defaults.conf.

Say it was originally:

spark.executor.instances 7

I changed it to:

spark.executor.instances 4

I did not change anything else, and the error went away.
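
One possible reason fewer executors helps is that the original request asked YARN for more memory than the cluster could grant. This thread does not confirm that cause, so treat the following as a rough sketch only. It estimates the memory behind a given spark.executor.instances value. The 6G executor size comes from the question above. The 32 GB cluster total is hypothetical, and so are the helper names. The overhead rule (10% of executor memory, at least 384 MB) is an assumption that varies by Spark version.

```python
# Rough YARN sizing sketch. The helper names and the cluster total are
# hypothetical; the overhead rule is an assumption that depends on the
# Spark version in use.

def container_mb(executor_memory_mb, overhead_factor=0.10, min_overhead_mb=384):
    """Approximate memory YARN must grant for one executor container."""
    overhead_mb = max(min_overhead_mb, int(executor_memory_mb * overhead_factor))
    return executor_memory_mb + overhead_mb


def total_request_mb(instances, executor_memory_mb, **overhead_kwargs):
    """Approximate memory requested by all executors together."""
    return instances * container_mb(executor_memory_mb, **overhead_kwargs)


def max_instances(cluster_memory_mb, executor_memory_mb, **overhead_kwargs):
    """Largest executor count whose containers fit in the given memory."""
    return cluster_memory_mb // container_mb(executor_memory_mb, **overhead_kwargs)


if __name__ == "__main__":
    executor_memory_mb = 6 * 1024   # spark.executor.memory = 6G, as in the question
    cluster_memory_mb = 32 * 1024   # hypothetical memory available to YARN

    for instances in (7, 4):
        needed = total_request_mb(instances, executor_memory_mb)
        fits = needed <= cluster_memory_mb
        print(f"{instances} executors need about {needed} MB; fits: {fits}")

    print("max executors that fit:",
          max_instances(cluster_memory_mb, executor_memory_mb))
```

Replace the hypothetical total with the memory your own YARN queue can actually allocate. If the requested total is above it, lowering spark.executor.instances or spark.executor.memory is worth trying first.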

Archived:	10 years, 6 months ago
Views:	1291
Last activity:	9 years, 7 months ago