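My code runs on EMR with Spark 2.0.2. It works for smaller files but frequently crashes on files larger than 15 GB. The crash happens in the unpersist call, which, incidentally, is the last step of the processing.
Any ideas would be very helpful. Thanks!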
17/05/06 23:46:01 ERROR TransportResponseHandler: Still have 1 requests outstanding when connection from /10.0.2.149:56200 is closed
17/05/06 23:46:01 INFO YarnSchedulerBackend$YarnDriverEndpoint: Disabling executor 7.
17/05/06 23:46:01 INFO DAGScheduler: Executor lost: 7 (epoch 5)
17/05/06 23:46:01 INFO BlockManagerMasterEndpoint: Trying to remove executor 7 from BlockManagerMaster.
17/05/06 23:46:01 WARN BlockManagerMaster: Failed to remove RDD 43 - Connection reset by peer
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
at sun.nio.ch.IOUtil.read(IOUtil.java:192)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
at io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:313)
at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:881)
at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:242) …