小编Adi*_*kar的帖子

Spark java.io.EOFException:过早的EOF:没有可用的长度前缀

我正在尝试阅读镶木地板文件并对其执行一些操作,并将结果保存为HDFS上的镶木地板.我是用Spark做的.在这样做的同时,我得到了以下异常.

java.io.EOFException: Premature EOF: no length prefix available
at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2203)
at org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:176)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:867)
Run Code Online (Sandbox Code Playgroud)

关于可能的原因和解决方案的任何帮助.

使用CDH 5.4.1

hadoop hdfs cloudera apache-spark

10
推荐指数
1
解决办法
3684
查看次数

Spark异常:写入行时任务失败

我正在阅读文本文件并将其转换为镶木地板文件.我正在使用火花代码.但是,当我尝试运行代码时,我得到以下异常

org.apache.spark.SparkException: Job aborted due to stage failure: Task 2 in stage 1.0 failed 4 times, most recent failure: Lost task 2.3 in stage 1.0 (TID 9, XXXX.XXX.XXX.local): org.apache.spark.SparkException: Task failed while writing rows.
    at org.apache.spark.sql.sources.InsertIntoHadoopFsRelation.org$apache$spark$sql$sources$InsertIntoHadoopFsRelation$$writeRows$1(commands.scala:191)
    at org.apache.spark.sql.sources.InsertIntoHadoopFsRelation$$anonfun$insert$1.apply(commands.scala:160)
    at org.apache.spark.sql.sources.InsertIntoHadoopFsRelation$$anonfun$insert$1.apply(commands.scala:160)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:63)
    at org.apache.spark.scheduler.Task.run(Task.scala:70)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ArithmeticException: / by zero
    at parquet.hadoop.InternalParquetRecordWriter.initStore(InternalParquetRecordWriter.java:101)
    at parquet.hadoop.InternalParquetRecordWriter.<init>(InternalParquetRecordWriter.java:94)
    at parquet.hadoop.ParquetRecordWriter.<init>(ParquetRecordWriter.java:64)
    at parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:282)
    at parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:252)
    at org.apache.spark.sql.parquet.ParquetOutputWriter.<init>(newParquet.scala:83)
    at org.apache.spark.sql.parquet.ParquetRelation2$$anon$4.newInstance(newParquet.scala:229)
    at org.apache.spark.sql.sources.DefaultWriterContainer.initWriters(commands.scala:470)
    at org.apache.spark.sql.sources.BaseWriterContainer.executorSideSetup(commands.scala:360)
    at org.apache.spark.sql.sources.InsertIntoHadoopFsRelation.org$apache$spark$sql$sources$InsertIntoHadoopFsRelation$$writeRows$1(commands.scala:172)
    ... 8 …
Run Code Online (Sandbox Code Playgroud)

java hadoop apache-spark parquet apache-spark-sql

6
推荐指数
2
解决办法
2万
查看次数

SparkStreaming - ExitCodeException exitCode = 13

我正在使用spark-submit在yarn-cluster上运行我的spark流应用程序.当我在本地模式下运行它时工作正常.但是当我尝试使用spark-submit在yarn-cluster上运行它时,它会运行一段时间然后以下面的execption退出.

Diagnostics: Exception from container-launch.
Container id: container_1435576266959_1208_02_000002
Exit code: 13
Stack trace: ExitCodeException exitCode=13:
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
        at org.apache.hadoop.util.Shell.run(Shell.java:455)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
        at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Run Code Online (Sandbox Code Playgroud)

任何帮助将不胜感激.

java apache-spark spark-streaming

5
推荐指数
1
解决办法
3656
查看次数