bob*_*bob 6 hadoop-yarn apache-spark apache-spark-sql
我正在加入 1100 万条记录。我在 EMR Cluster Spark 2.2.1 中与 5 个工作人员一起运行
运行作业时出现以下错误:
executor 3): java.lang.IllegalArgumentException: Cannot allocate a page with more than 17179869176 bytes
at org.apache.spark.memory.TaskMemoryManager.allocatePage(TaskMemoryManager.java:277)
at org.apache.spark.memory.MemoryConsumer.allocateArray(MemoryConsumer.java:90)
at org.apache.spark.shuffle.sort.ShuffleExternalSorter.growPointerArrayIfNecessary(ShuffleExternalSorter.java:328)
at org.apache.spark.shuffle.sort.ShuffleExternalSorter.insertRecord(ShuffleExternalSorter.java:379)
at org.apache.spark.shuffle.sort.UnsafeShuffleWriter.insertRecordIntoSorter(UnsafeShuffleWriter.java:246)
at org.apache.spark.shuffle.sort.UnsafeShuffleWriter.write(UnsafeShuffleWriter.java:167)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
at org.apache.spark.scheduler.Task.run(Task.scala:108)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Run Code Online (Sandbox Code Playgroud)
我无法理解可能的原因。请帮我设置什么参数。
目前我正在运行以下参数: --num-executors 5 --conf spark.eventLog.enabled=true --executor-memory 70g --driver-memory 30g --executor-cores 16 --conf spark.shuffle.memoryFraction=0.5
归档时间: |
|
查看次数: |
1718 次 |
最近记录: |