请求帮助以了解此消息..
INFO spark.MapOutputTrackerMaster: Size of output statuses for shuffle 2 is **2202921** bytes
Run Code Online (Sandbox Code Playgroud)
2202921在这里意味着什么?
我的工作是一个随机操作,当从前一个阶段读取随机文件时,它首先给出消息,然后在一段时间之后失败并出现以下错误:
14/11/12 11:09:46 WARN scheduler.TaskSetManager: Lost task 224.0 in stage 4.0 (TID 13938, ip-xx-xxx-xxx-xx.ec2.internal): FetchFailed(BlockManagerId(11, ip-xx-xxx-xxx-xx.ec2.internal, 48073, 0), shuffleId=2, mapId=7468, reduceId=224)
14/11/12 11:09:46 INFO scheduler.DAGScheduler: Marking Stage 4 (coalesce at <console>:49) as failed due to a fetch failure from Stage 3 (map at <console>:42)
14/11/12 11:09:46 INFO scheduler.DAGScheduler: Stage 4 (coalesce at <console>:49) failed in 213.446 s
14/11/12 11:09:46 INFO scheduler.DAGScheduler: Resubmitting Stage 3 (map at …Run Code Online (Sandbox Code Playgroud)