Oozie shell script action

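I am exploring Oozie's capabilities for managing Hadoop workflows. I am trying to set up a shell action that invokes some Hive commands. My shell script, hive.sh, looks like this: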

#!/bin/bash
hive -f hivescript

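The Hive script (already tested on its own) creates some tables and so on. My question is where to keep hivescript, and how to reference it from the shell script.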

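I tried two approaches. The first uses a local path: hive -f /local/path/to/file. The second uses the relative path shown above, hive -f hivescript; in that case I kept hivescript in the Oozie application path directory (the same directory as hive.sh and workflow.xml) and set it up in workflow.xml to be shipped to the distributed cache.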
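The relative-path approach could be sketched as follows. This is a minimal sketch, assuming Oozie places each file listed in a `<file>` element in the action's working directory, so that hivescript resolves by its bare name. `run_hive_script` is a hypothetical helper name, not part of Oozie or Hive.

```shell
#!/bin/bash
# Sketch only: assumes the files named in the workflow's <file> elements
# are present in the current working directory when the action runs.

run_hive_script() {
    script="$1"
    # Fail early with a descriptive message instead of a bare exit code 1.
    if [ ! -f "$script" ]; then
        echo "Hive script not found: $script (cwd: $(pwd))" >&2
        # List the working directory so the log shows what was shipped.
        ls -l >&2
        return 2
    fi
    hive -f "$script"
}
```

hive.sh would then end with `run_hive_script hivescript`. The existence check writes the working directory listing to stderr, which may help narrow down a generic shell-action failure in the launcher log.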

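With both approaches I get the error message "Main class [org.apache.oozie.action.hadoop.ShellMain], exit code [1]" in the Oozie web console. I also tried using an HDFS path in the shell script, which, as far as I understand, does not work.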
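One possible workaround for the HDFS-path case is to copy the script out of HDFS before calling Hive. The sketch below assumes the `hadoop` and `hive` commands are on the PATH of the node that runs the action; `fetch_and_run` is a hypothetical helper name, and the HDFS path in the usage note is only an illustration built from the job.properties values in this question.

```shell
#!/bin/bash
# Sketch only: copy the Hive script from HDFS to the working directory,
# then run the local copy.

fetch_and_run() {
    hdfs_path="$1"
    local_copy="./$(basename "$hdfs_path")"
    # Remove any stale copy first, since "hadoop fs -get" does not
    # overwrite an existing local file by default.
    rm -f "$local_copy"
    hadoop fs -get "$hdfs_path" "$local_copy" || return 1
    hive -f "$local_copy"
}
```

It could be called as `fetch_and_run hdfs://sandbox:8020/user/sandbox/poc1/testwf/hivescript`.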

My job.properties file:

nameNode=hdfs://sandbox:8020
jobTracker=hdfs://sandbox:50300   
queueName=default
oozie.libpath=${nameNode}/user/oozie/share/lib
oozie.use.system.libpath=true
oozieProjectRoot=${nameNode}/user/sandbox/poc1
appPath=${oozieProjectRoot}/testwf
oozie.wf.application.path=${appPath}

And workflow.xml:

<shell xmlns="uri:oozie:shell-action:0.1">
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <configuration>
        <property>
            <name>mapred.job.queue.name</name>
            <value>${queueName}</value>
        </property>
    </configuration>
    <exec>${appPath}/hive.sh</exec>
    <file>${appPath}/hive.sh</file>
    <file>${appPath}/hive_pill</file>
</shell>
<ok to="end"/>
<error to="end"/>
</action>
<end name="end"/>

My goal is to use Oozie to invoke a Hive script through a shell script. Please share your suggestions.

bash hadoop hive oozie

the*_*ior

2014-03-14

5 votes, 2 answers, 10k views