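I am trying to run a simple workflow that executes a Hive script. The script only performs a join (the tables are very large). Once the Hive script finishes, I expect the workflow status to change from RUNNING to SUCCEEDED, but that does not happen.
This is what the workflow log contains: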
2016-05-31 15:52:34,590 WARN org.apache.oozie.action.hadoop.HiveActionExecutor: SERVER[hadoop02] USER[scapp] GROUP[-] TOKEN[] APP[wf-sqoop-hive-agreement] JOB[0000001-160531143657136-oozie-oozi-W] ACTION[0000001-160531143657136-oozie-oozi-W@hive-query-agreement] Launcher ERROR, reason: Main class [org.apache.oozie.action.hadoop.HiveMain], exception invoking main(), Output data exceeds its limit [2048]
2016-05-31 15:52:34,591 WARN org.apache.oozie.action.hadoop.HiveActionExecutor: SERVER[hadoop02] USER[scapp] GROUP[-] TOKEN[] APP[wf-sqoop-hive-agreement] JOB[0000001-160531143657136-oozie-oozi-W] ACTION[0000001-160531143657136-oozie-oozi-W@hive-query-agreement] Launcher exception: Output data exceeds its limit [2048]
org.apache.oozie.action.hadoop.LauncherException: Output data exceeds its limit [2048]
at org.apache.oozie.action.hadoop.LauncherMapper.getLocalFileContentStr(LauncherMapper.java:415)
at org.apache.oozie.action.hadoop.LauncherMapper.handleActionData(LauncherMapper.java:391)
at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:275)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
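@BorderStark I don't think that property expresses its size in MB. The size is in characters (that is, bytes), according to the following entry in the oozie-default.xml file: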
<property>
<name>oozie.action.max.output.data</name>
<value>2048</value>
<description>
Max size in characters for output data.
</description>
</property>
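If raising the limit is the route chosen, the default shown above could be overridden in oozie-site.xml. This is only a minimal sketch: the value 8192 is an arbitrary assumption, not a recommendation, and the Oozie server would typically need a restart before the new value takes effect.

```xml
<!-- oozie-site.xml: overrides the default defined in oozie-default.xml -->
<!-- 8192 is an illustrative value (assumption); size it to the data the action actually emits -->
<property>
    <name>oozie.action.max.output.data</name>
    <value>8192</value>
</property>
```

Raising the limit only hides the symptom if the action keeps emitting more data, so redirecting the query output is probably still worth doing.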
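I think the Hive query produces a very large amount of output that is not being redirected anywhere.
I suggest sending the output of your SELECT query to a location in HDFS, for example by writing it into an external or managed Hive table.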