Posted by gat*_*789

apache-spark: rdd.unpersist crashes on large files

My code runs on EMR with Spark 2.0.2. It works fine on smaller files, but regularly crashes on files larger than 15 GB. The crash happens inside the unpersist call, which happens to be the last step of the processing.

Any ideas would be much appreciated. Thanks!
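(Not part of the original question, but useful context for the log below.) In Spark 2.x, `RDD.unpersist` blocks by default until every executor confirms it has dropped its blocks, so calling `rdd.unpersist(blocking = false)` avoids the driver waiting on an executor that has already been lost. Independently, the RPC timeouts that govern the dropping connections can be raised at submit time. A sketch, where `my_job.py` is a hypothetical job script and the values are illustrative rather than tuned:

```shell
# Raise the network idle timeout (default 120s) and heartbeat interval
# so slow executors working on a 15GB+ file are not declared dead early.
spark-submit \
  --conf spark.network.timeout=600s \
  --conf spark.executor.heartbeatInterval=60s \
  my_job.py
```

`spark.executor.heartbeatInterval` must stay well below `spark.network.timeout`, otherwise healthy executors get timed out between heartbeats.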

    17/05/06 23:46:01 ERROR TransportResponseHandler: Still have 1 requests outstanding when connection from /10.0.2.149:56200 is closed
    17/05/06 23:46:01 INFO YarnSchedulerBackend$YarnDriverEndpoint: Disabling executor 7.
    17/05/06 23:46:01 INFO DAGScheduler: Executor lost: 7 (epoch 5)
    17/05/06 23:46:01 INFO BlockManagerMasterEndpoint: Trying to remove executor 7 from BlockManagerMaster.
    17/05/06 23:46:01 WARN BlockManagerMaster: Failed to remove RDD 43 - Connection reset by peer
    java.io.IOException: Connection reset by peer
        at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
        at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
        at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
        at sun.nio.ch.IOUtil.read(IOUtil.java:192)
        at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
        at io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:313)
        at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:881)
        at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:242) …

timeout large-files apache-spark

7 recommended · 0 solutions · 494 views
