我正在尝试使用scala将文件写入hdfs并且我不断收到以下错误
Caused by: org.apache.hadoop.ipc.RemoteException: Server IPC version 9 cannot communicate with client version 4
at org.apache.hadoop.ipc.Client.call(Client.java:1113)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
at com.sun.proxy.$Proxy1.getProtocolVersion(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:85)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:62)
at com.sun.proxy.$Proxy1.getProtocolVersion(Unknown Source)
at org.apache.hadoop.ipc.RPC.checkVersion(RPC.java:422)
at org.apache.hadoop.hdfs.DFSClient.createNamenode(DFSClient.java:183)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:281)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:245)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:100)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1446)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:67)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1464)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:263)
at bcomposes.twitter.Util$.<init>(TwitterStream.scala:39)
at bcomposes.twitter.Util$.<clinit>(TwitterStream.scala)
at bcomposes.twitter.StatusStreamer$.main(TwitterStream.scala:17)
at bcomposes.twitter.StatusStreamer.main(TwitterStream.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
Run Code Online (Sandbox Code Playgroud)
我按照本教程安装了hadoop .下面的代码是我用来将示例文件插入hdfs的代码.
val configuration = new Configuration();
val hdfs = FileSystem.get( new URI( "hdfs://192.168.11.153:54310" ), configuration );
val file = new Path("hdfs://192.168.11.153:54310/s2013/batch/table.html");
if ( hdfs.exists( file )) { hdfs.delete( file, true ); }
val os = hdfs.create( file);
val br = new BufferedWriter( new OutputStreamWriter( os, "UTF-8" ) );
br.write("Hello World");
br.close();
hdfs.close();
Run Code Online (Sandbox Code Playgroud)
Hadoop版本是2.4.0,我使用的hadoop库版本是1.2.1.我应该做些什么改变才能做到这一点?
我在使用Hadoop 2.3时遇到了同样的问题,我已经解决了它在build.sbt文件中添加以下行:
libraryDependencies += "org.apache.hadoop" % "hadoop-client" % "2.3.0"
libraryDependencies += "org.apache.hadoop" % "hadoop-hdfs" % "2.3.0"
Run Code Online (Sandbox Code Playgroud)
所以我认为在你的情况下你使用的是2.4.0版本.
PS:它也适用于您的代码示例.我希望它会有所帮助
hadoop和spark versions 应该同步。(就我而言,我正在使用spark-1.2.0和hadoop 2.2.0)
第 1 步- 转到$SPARK_HOME
第 2 步- 只需使用您想要的客户端mvn build 版本hadoop即可,
mvn -Pyarn -Phadoop-2.2 -Dhadoop.version=2.2.0 -DskipTests clean package
Run Code Online (Sandbox Code Playgroud)
步骤 3 - Spark 项目也应该有正确的 Spark 版本,
name := "smartad-spark-songplaycount"
version := "1.0"
scalaVersion := "2.10.4"
//libraryDependencies += "org.apache.spark" %% "spark-core" % "1.1.1"
libraryDependencies += "org.apache.spark" % "spark-core_2.10" % "1.2.0"
libraryDependencies += "org.apache.hadoop" % "hadoop-client" % "2.2.0"
libraryDependencies += "org.apache.hadoop" % "hadoop-hdfs" % "2.2.0"
resolvers += "Akka Repository" at "http://repo.akka.io/releases/"
Run Code Online (Sandbox Code Playgroud)