mnm*_*mnm 8 java hadoop mapreduce bigtop
I am trying to add an external jar to the Hadoop classpath, but have had no luck so far.
I have the following setup:
$ hadoop version
Hadoop 2.0.6-alpha
Subversion https://git-wip-us.apache.org/repos/asf/bigtop.git -r ca4c88898f95aaab3fd85b5e9c194ffd647c2109
Compiled by jenkins on 2013-10-31T07:55Z
From source with checksum 95e88b2a9589fa69d6d5c1dbd48d4e
This command was run using /usr/lib/hadoop/hadoop-common-2.0.6-alpha.jar
Classpath
$ echo $HADOOP_CLASSPATH
/home/tom/workspace/libs/opencsv-2.3.jar
I can see that the HADOOP_CLASSPATH above has been picked up by Hadoop:
$ hadoop classpath
/etc/hadoop/conf:/usr/lib/hadoop/lib/:/usr/lib/hadoop/.//:/home/tom/workspace/libs/opencsv-2.3.jar:/usr/lib/hadoop-hdfs/./:/usr/lib/hadoop-hdfs/lib/:/usr/lib/hadoop-hdfs/.//:/usr/lib/hadoop-yarn/lib/:/usr/lib/hadoop-yarn/.//:/usr/lib/hadoop-mapreduce/lib/:/usr/lib/hadoop-mapreduce/.//
Command
$ sudo hadoop jar FlightsByCarrier.jar FlightsByCarrier /user/root/1987.csv /user/root/result
I also tried the -libjars option:
$ sudo hadoop jar FlightsByCarrier.jar FlightsByCarrier /user/root/1987.csv /user/root/result -libjars /home/tom/workspace/libs/opencsv-2.3.jar
Stack trace
14/11/04 16:43:23 INFO mapreduce.Job: Running job: job_1415115532989_0001
14/11/04 16:43:55 INFO mapreduce.Job: Job job_1415115532989_0001 running in uber mode : false
14/11/04 16:43:56 INFO mapreduce.Job: map 0% reduce 0%
14/11/04 16:45:27 INFO mapreduce.Job: map 50% reduce 0%
14/11/04 16:45:27 INFO mapreduce.Job: Task Id : attempt_1415115532989_0001_m_000001_0, Status : FAILED
Error: java.lang.ClassNotFoundException: au.com.bytecode.opencsv.CSVParser
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    at FlightsByCarrierMapper.map(FlightsByCarrierMapper.java:19)
    at FlightsByCarrierMapper.map(FlightsByCarrierMapper.java:10)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:757)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:158)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1478)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:153)
Any help is much appreciated.
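Your external jar is missing on the nodes that run the map tasks. You have to add it to the distributed cache to make it available there. Try: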
DistributedCache.addFileToClassPath(new Path("pathToJar"), conf);
I am not sure in which version DistributedCache was deprecated, but from Hadoop 2.2.0 onward you can use:
job.addFileToClassPath(new Path("pathToJar"));
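To check whether the jar really reaches the task JVM, a small diagnostic along these lines might help. It is only a sketch: `ClasspathCheck`, `isLoadable`, and `classpathEntries` are hypothetical names that are not part of Hadoop or of the original post, and it assumes you call the helper yourself (for example from the mapper's setup code) and read the result in the task logs.

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;

// Hypothetical diagnostic helper; uses only the Java standard library.
final class ClasspathCheck {

    // Returns true if the named class can be found by the class loader
    // that loaded this helper, false if it is missing or fails to link.
    static boolean isLoadable(String className) {
        try {
            // initialize=false: only locate the class, do not run static initializers.
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // Splits the JVM's java.class.path property into its individual entries.
    static List<String> classpathEntries() {
        String cp = System.getProperty("java.class.path", "");
        return Arrays.asList(cp.split(File.pathSeparator));
    }

    public static void main(String[] args) {
        // Defaults to the class named in the stack trace above.
        String name = args.length > 0 ? args[0] : "au.com.bytecode.opencsv.CSVParser";
        System.out.println(name + " loadable: " + isLoadable(name));
        for (String entry : classpathEntries()) {
            System.out.println("  " + entry);
        }
    }
}
```

If the class is reported as loadable on the machine that submits the job but not inside the map task, that would be consistent with the jar being present only on the client classpath.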