I have a Hadoop job in which the mapper must use an external jar.
I have tried passing this jar to the mapper's JVM
via the -libjars argument on the hadoop command
hadoop jar mrrunner.jar DAGMRRunner -libjars <path_to_jar>/colt.jar
via job.addFileToClassPath
job.addFileToClassPath(new Path("<path_to_jar>/colt.jar"));
and by putting it on HADOOP_CLASSPATH.
g1mihai@hydra:/home/g1mihai/$ echo $HADOOP_CLASSPATH
<path_to_jar>/colt.jar
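None of these approaches worked. Below is the stack trace I get. The missing class it complains about is SparseDoubleMatrix1D, which lives in colt.jar.
Let me know if I should provide any additional debugging information. Thanks.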
15/02/14 16:47:51 INFO mapred.MapTask: Starting flush of map output
15/02/14 16:47:51 INFO mapred.LocalJobRunner: map task executor complete.
15/02/14 16:47:51 WARN mapred.LocalJobRunner: job_local368086771_0001
java.lang.Exception: java.lang.NoClassDefFoundError: Lcern/colt/matrix/impl/SparseDoubleMatrix1D;
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.NoClassDefFoundError: Lcern/colt/matrix/impl/SparseDoubleMatrix1D;
at java.lang.Class.getDeclaredFields0(Native Method)
at java.lang.Class.privateGetDeclaredFields(Class.java:2499)
at java.lang.Class.getDeclaredField(Class.java:1951)
at java.io.ObjectStreamClass.getDeclaredSUID(ObjectStreamClass.java:1659)
at java.io.ObjectStreamClass.access$700(ObjectStreamClass.java:72)
at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:480)
at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:468)
at java.security.AccessController.doPrivileged(Native Method)
at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:468)
at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:365)
at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:602)
at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1622)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517) …
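
For extra debugging information, here is a minimal sketch of a check that could show whether the class is actually visible to the JVM that throws the error. ClasspathCheck and isLoadable are hypothetical names of my own, not part of Hadoop, and the sketch uses only the standard library. It assumes the check runs in the same JVM as the failing code (the trace shows LocalJobRunner) and that the thread context class loader is the relevant one, which may not hold for the ObjectInputStream call in the trace.

```java
import java.io.File;

// Hypothetical diagnostic helper (not a Hadoop API): reports whether a class
// can be located by a given class loader in the JVM where it runs.
final class ClasspathCheck {

    // Returns true if the named class can be found by the loader.
    // initialize=false means static initializers of the class are not run.
    static boolean isLoadable(String className, ClassLoader loader) {
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // LinkageError also covers NoClassDefFoundError.
            return false;
        }
    }

    public static void main(String[] args) {
        // The default is the class named in the stack trace above.
        String name = args.length > 0
                ? args[0]
                : "cern.colt.matrix.impl.SparseDoubleMatrix1D";

        // Print what the JVM was started with, one entry per line.
        String classPath = System.getProperty("java.class.path");
        for (String entry : classPath.split(File.pathSeparator)) {
            System.out.println("classpath entry: " + entry);
        }

        // Assumption: the context class loader is the one of interest.
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        System.out.println(name + " loadable: " + isLoadable(name, loader));
    }
}
```

Calling isLoadable from the driver just before submitting the job, and again from the mapper, should make it possible to tell whether colt.jar is missing everywhere or only on the task side.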