Apache Pig: Unable to run my own pig.jar and pig-withouthadoop.jar

sun*_*nny 5 java hadoop mapreduce apache-pig

I have a cluster running Hadoop 0.20.2 and Pig 0.10. I would like to add some logging to Pig's source code and run my own build of Pig on the cluster.

What I did:

  1. Built the project with the 'ant' command

  2. Got pig.jar and pig-withouthadoop.jar

  3. Copied the jars to the Pig home directory on the cluster's namenode

  4. Ran a job

Then I got the following on stdout:

2013-03-25 06:35:05,226 [main] WARN  org.apache.pig.backend.hadoop20.PigJobControl -   falling back to default JobControl (not using hadoop 0.20 ?)
java.lang.NoSuchFieldException: runnerState
    at java.lang.Class.getDeclaredField(Class.java:1882)
    at org.apache.pig.backend.hadoop20.PigJobControl.<clinit>(PigJobControl.java:51)
    at  org.apache.pig.backend.hadoop.executionengine.shims.HadoopShims.newJobControl(HadoopShims.java:97)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.compile(JobControlCompiler.java:287)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:177)
    at org.apache.pig.PigServer.launchPlan(PigServer.java:1320)
    at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1305)
    at org.apache.pig.PigServer.execute(PigServer.java:1295)
    at org.apache.pig.PigServer.executeBatch(PigServer.java:375)
    at org.apache.pig.PigServer.executeBatch(PigServer.java:353)
    at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:137)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:198)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:170)
    at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
    at org.apache.pig.Main.run(Main.java:480)
    at org.apache.pig.Main.main(Main.java:157)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at  sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
2013-03-25 06:35:05,229 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
2013-03-25 06:35:05,260 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2013-03-25 06:35:05,272 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting Parallelism to 1
2013-03-25 06:35:06,041 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - creating jar file Job9091543475518322185.jar
2013-03-25 06:35:10,974 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - jar file Job9091543475518322185.jar created
2013-03-25 06:35:10,995 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2013-03-25 06:35:11,006 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Key [pig.schematuple] is false, will not generate code.
2013-03-25 06:35:11,006 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Starting process to move generated code to distributed cacche
2013-03-25 06:35:11,006 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Setting key [pig.schematuple.classes] with classes to deserialize []
2013-03-25 06:35:11,181 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. org.apache.hadoop.mapred.jobcontrol.JobControl.addJob(Lorg/apache/hadoop/mapred/jobcontrol/Job;)Ljava/lang/String;

Pig stack trace:

ERROR 2998: Unhandled internal error. org.apache.hadoop.mapred.jobcontrol.JobControl.addJob(Lorg/apache/hadoop/mapred/jobcontrol/Job;)Ljava/lang/String;

java.lang.NoSuchMethodError: org.apache.hadoop.mapred.jobcontrol.JobControl.addJob(Lorg/apache/hadoop/mapred/jobcontrol/Job;)Ljava/lang/String;
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.compile(JobControlCompiler.java:298)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:177)
    at org.apache.pig.PigServer.launchPlan(PigServer.java:1320)
    at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1305)
    at org.apache.pig.PigServer.execute(PigServer.java:1295)
    at org.apache.pig.PigServer.executeBatch(PigServer.java:375)
    at org.apache.pig.PigServer.executeBatch(PigServer.java:353)
    at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:137)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:198)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:170)
    at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
    at org.apache.pig.Main.run(Main.java:480)
    at org.apache.pig.Main.main(Main.java:157)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:208)

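What went wrong? Is there anything else I should do besides replacing pig.jar and pig-withouthadoop.jar in the installation directory on the namenode?

Any help would be appreciated...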

sun*_*nny 6

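The point I had missed: pig-withouthadoop.jar has to be compiled against a specific Hadoop version. A NoSuchMethodError like the one on JobControl.addJob is the typical symptom of a jar built against a different Hadoop API than the one it runs with. I built the jar as follows, and it works fine: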

% ant clean jar-withouthadoop -Dhadoopversion=23
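If you want to confirm which signature the Hadoop jars on your classpath actually expose before rebuilding, a small reflection check can help. This is only a sketch: `ApiCheck` and `returnTypeOf` are hypothetical names I made up, not part of Pig or Hadoop, and it only looks at public methods with non-primitive parameter types. It matters because the JVM links a call by name, parameter types and return type together, which is why the error message spells out `addJob(Lorg/apache/hadoop/mapred/jobcontrol/Job;)Ljava/lang/String;`.

```java
import java.lang.reflect.Method;
import java.util.Arrays;

// Hypothetical diagnostic helper (not part of Pig or Hadoop).
final class ApiCheck {

    // Returns the declared return type of the public method with the given
    // name and parameter classes, or null if the class or method is not
    // visible on the current classpath. Primitive parameter types are not
    // supported, because Class.forName cannot resolve them.
    static String returnTypeOf(String className, String methodName, String... paramClassNames) {
        try {
            Class<?>[] params = new Class<?>[paramClassNames.length];
            for (int i = 0; i < params.length; i++) {
                params[i] = Class.forName(paramClassNames[i]);
            }
            Method m = Class.forName(className).getMethod(methodName, params);
            return m.getReturnType().getName();
        } catch (ClassNotFoundException | NoSuchMethodException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        if (args.length < 2) {
            System.out.println("usage: ApiCheck <class> <method> [paramClass...]");
            return;
        }
        String[] params = Arrays.copyOfRange(args, 2, args.length);
        String ret = returnTypeOf(args[0], args[1], params);
        System.out.println(ret == null ? "not found" : "returns " + ret);
    }
}
```

Run it with the cluster's Hadoop jars on the classpath and pass `org.apache.hadoop.mapred.jobcontrol.JobControl addJob org.apache.hadoop.mapred.jobcontrol.Job` as arguments. If it prints anything other than `returns java.lang.String`, the Pig jar that raised the error above was compiled for a different Hadoop API than the one installed, and rebuilding with the matching `-Dhadoopversion` value is the fix.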