EMR - Problem running a Pig script from S3

Aja*_*jay 0 hadoop amazon-s3 apache-pig amazon-emr

I am trying to run a Pig script on EMR like this:

pig -f s3://bucket-name/loadData.pig

But it fails with the error:

ERROR 2999: Unexpected internal error. null

java.lang.NullPointerException
    at org.apache.pig.impl.io.FileLocalizer.fetchFilesInternal(FileLocalizer.java:778)
    at org.apache.pig.impl.io.FileLocalizer.fetchFiles(FileLocalizer.java:746)
    at org.apache.pig.PigServer.registerJar(PigServer.java:458)
    at org.apache.pig.tools.grunt.GruntParser.processRegister(GruntParser.java:433)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:445)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:194)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:170)
    at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
    at org.apache.pig.Main.run(Main.java:479)
    at org.apache.pig.Main.main(Main.java:159)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:187)

loadData.pig looks like this:

A = load '/ajasing/input/input.txt' USING PigStorage('\t', '-noschema');
store A into '/ajasing/output1444/input1444.txt';

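I am running Pig version 0.11.1, Hadoop version 1.0.3, and AMI version 2.4.6.

If I run this Pig script locally, i.e. after copying it onto the EMR cluster, it works fine. But when the script source is S3, it fails with the error above.

Please tell me what is wrong here.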

Anonymous 5

Are you loading any .jar files? I just ran into this exact problem, which I solved by changing

REGISTER /home/hadoop/mongo-java-driver-2.11.1.jar;

to

REGISTER file:/home/hadoop/mongo-java-driver-2.11.1.jar;

following this post: https://forums.aws.amazon.com/thread.jspa?messageID=480997

Works like a charm!
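
If a script has several REGISTER statements, the same change can be applied mechanically before running it. The following is only a minimal shell sketch, assuming every jar is registered by an absolute local path; the function name add_file_scheme and the file names in the usage comment are hypothetical and not part of the original post.

```shell
# Hypothetical helper: print a Pig script with "file:" added in front of
# absolute local paths in REGISTER statements.
# Lines that already use a scheme (file:, s3://, hdfs://) are left unchanged,
# because the pattern only matches a path that starts with "/".
add_file_scheme() {
  sed -E 's#^([[:space:]]*[Rr][Ee][Gg][Ii][Ss][Tt][Ee][Rr][[:space:]]+)/#\1file:/#' "$1"
}

# Usage (file names are placeholders):
#   add_file_scheme loadData.pig > loadData.fixed.pig
```

The function writes to standard output instead of editing in place, so the original script stays untouched and the result can be checked before it is uploaded back to S3.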