use*_*093 | 12 | Tags: c#, mono, hadoop-streaming
I have a mapper and a reducer written in C# and compiled to executables, and I want to use them with Hadoop Streaming.
This is the command I use to create the Hadoop job:
hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-streaming-*.jar \
-input "/user/hduser/ss_waits" \
-output "/user/hduser/ss_waits-output" \
–mapper "mono mapper.exe" \
–reducer "mono reducer.exe" \
-file "mapper.exe" \
-file "reducer.exe"
This is the error that every mapper hits:
java.io.IOException: Type mismatch in key from map: expected org.apache.hadoop.io.Text, recieved org.apache.hadoop.io.LongWritable
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1014)
at org.apache.hadoop.mapred.MapTask$OldOutputCollector.collect(MapTask.java:592)
at org.apache.hadoop.mapred.lib.IdentityMapper.map(IdentityMapper.java:38)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
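Based on the call stack, the problem seems to be that the (Java) IdentityMapper class is being used as the mapper. That explains the type mismatch: IdentityMapper passes the LongWritable input key straight through, while the job expects Text keys from the map. The mapper should be the executable "mono mapper.exe".
Any ideas why mono mapper.exe is not being used?
mapper.exe and reducer.exe have the following permissions: -rwxr-xr-x
I can run mono mapper.exe successfully from the Unix command shell, and it reads text from stdin and writes to stdout.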
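The same local check can be stretched to cover the whole map, sort, reduce chain before the job is submitted. The following is only a sketch under assumptions: `simulate_streaming` is a hypothetical helper name, `sample.txt` is a placeholder input file, and a plain `sort` only approximates what Hadoop does between the map and reduce phases.

```shell
# Hypothetical helper: approximate a streaming job on the local machine.
# $1 = mapper command, $2 = reducer command; input is read from stdin.
simulate_streaming() {
  # LC_ALL=C sort stands in for Hadoop's shuffle/sort step. It orders
  # whole lines by byte value, which only approximates sorting by key.
  sh -c "$1" | LC_ALL=C sort | sh -c "$2"
}

# Assumed usage (sample.txt is a placeholder, not a file from the post):
#   simulate_streaming "mono mapper.exe" "mono reducer.exe" < sample.txt
```

If this runs cleanly, the executables themselves are probably fine and the problem is more likely in how the job is submitted.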
Environment:
Assuming mono is on the PATH, do you need the full paths to mapper.exe and reducer.exe? I.e.:
hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-streaming-*.jar \
-input "/user/hduser/ss_waits" \
-output "/user/hduser/ss_waits-output" \
-mapper "mono /path/to/mapper.exe" \
-reducer "mono /path/to/reducer.exe" \
-file "mapper.exe" \
-file "reducer.exe"
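Separately from the path question, one more thing may be worth ruling out. In the command as posted in the question, the mapper and reducer options appear to start with an en dash (U+2013) rather than an ASCII hyphen. That may just be an artifact of how the post was rendered, but if the real command line contains it, the option would likely not be recognized, which would fit the job falling back to IdentityMapper. A minimal sketch for checking a command string, where `has_non_ascii` is a hypothetical helper name:

```shell
# Hypothetical helper: succeed (exit 0) if the argument contains any byte
# outside the ASCII range, such as the UTF-8 encoding of an en dash.
has_non_ascii() {
  # Delete every ASCII byte (octal 000-177); anything left over is
  # non-ASCII. LC_ALL=C makes tr operate on single bytes.
  [ -n "$(printf '%s' "$1" | LC_ALL=C tr -d '\000-\177')" ]
}

# The first argument starts with an en dash, the second with a hyphen.
has_non_ascii '–mapper "mono mapper.exe"' && echo "non-ASCII character found"
has_non_ascii '-mapper "mono mapper.exe"' || echo "plain ASCII"
```

If the check fires on the real command, retyping the option names by hand with plain hyphens is a cheap thing to try before changing any paths.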