I have a few questions about the Oozie 2.3 shared library:
Currently, I define the shared library in coordinator.properties:
oozie.use.system.libpath=true
oozie.libpath=<hdfs_path>
Here are my questions:
When the shared library is copied out to the data nodes, how many data nodes end up receiving it?
Is the shared library copied to the data nodes once for each workflow in the coordinator job, or only once per coordinator job?
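For reference on how the sharelib is typically handled: the jars live once under a single HDFS path, and Oozie adds them to each launched action's distributed cache rather than pre-copying them to every data node, so only the nodes that actually run an action's tasks pull the jars, on demand. A minimal way to look at what is staged, assuming the default sharelib location (adjust the path for your install; an old release like 2.3 may lay the directory out differently):

# List the shared-library jars Oozie can stage for actions
hadoop fs -ls /user/oozie/share/lib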
I created an Oozie workflow for a Hive script that loads data into a table.
My workflow.xml contains -
<workflow-app xmlns="uri:oozie:workflow:0.4" name="Hive-Table-Insertion">
    <start to="InsertData"/>
    <action name="InsertData">
        <hive xmlns="uri:oozie:hive-action:0.4">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <prepare>
                <delete path="${workflowRoot}/output-data/hive"/>
                <mkdir path="${workflowRoot}/output-data"/>
            </prepare>
            <job-xml>${workflowRoot}/hive-site.xml</job-xml>
            <configuration>
                <property>
                    <name>oozie.hive.defaults</name>
                    <value>${workflowRoot}/hive-site.xml</value>
                </property>
            </configuration>
            <script>load_data.hql</script>
        </hive>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Hive failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
My job.properties file contains -
nameNode=hdfs://localhost:8020
jobTracker=localhost:8021
queueName=default
workflowRoot=HiveLoadData
oozie.libpath=${nameNode}/user/oozie/share/lib
oozie.wf.application.path=${nameNode}/user/${user.name}/${workflowRoot}
When I try to submit my job with the command "oozie job -oozie http://localhost:11000/oozie -config /user/oozie/HiveLoadData/job.properties -submit", I get the following error:
java.io.IOException: configuration is not specified
at org.apache.oozie.cli.OozieCLI.getConfiguration(OozieCLI.java:729)
at org.apache.oozie.cli.OozieCLI.jobCommand(OozieCLI.java:879)
at org.apache.oozie.cli.OozieCLI.processCommand(OozieCLI.java:604)
at org.apache.oozie.cli.OozieCLI.run(OozieCLI.java:577)
at org.apache.oozie.cli.OozieCLI.main(OozieCLI.java:204)
configuration is not …
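A note on this error for anyone hitting it: the Oozie CLI reads the file passed to -config from the local filesystem of the machine where the command runs, not from HDFS, so pointing it at an HDFS-style path (or a file the local user cannot read) can surface as "configuration is not specified". A hedged sketch of a submission that keeps the properties file local (the local path is a placeholder):

# job.properties must be readable on the submitting host's local disk
oozie job -oozie http://localhost:11000/oozie \
    -config /home/myuser/HiveLoadData/job.properties -submit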
Question: We are trying to run a few commands on a specific host in the cluster, and we chose the SSH action. We keep running into this SSH issue. What could the real problem be? Please point me toward a solution.

Log:
AUTH_FAILED: Cannot perform operation [ssh -o PasswordAuthentication=no -o KbdInteractiveDevices=no -o StrictHostKeyChecking=no -o ConnectTimeout=20 USER@1.2.3.4 mkdir -p oozie-oozi/0000000-131008185935754-oozie-oozi-W/action1 - ssh /] | ErrorStream: Warning: Permanently added '1.2.3.4' (RSA) to the list of known hosts. Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).
org.apache.oozie.action.ActionExecutorException: AUTH_FAILED: Cannot perform operation [ssh -o PasswordAuthentication=no -o KbdInteractiveDevices=no -o StrictHostKeyChecking=no -o ConnectTimeout=20 user@1.2.3.4 mkdir -p oozie-oozi/0000000-131008185935754-oozie-oozi-W/action1 - ssh /] | ErrorStream: Warning: Permanently added '1.2.3.4,192.168.34.208' (RSA) to the list of known hosts. Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).
at org.apache.oozie.action.ssh.SshActionExecutor.execute(SshActionExecutor.java:589)
at org.apache.oozie.action.ssh.SshActionExecutor.start(SshActionExecutor.java:204)
at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:211)
at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:59)
at org.apache.oozie.command.XCommand.call(XCommand.java:277)
at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:326)
at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:255)
at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:175)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662) …
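For this AUTH_FAILED error: "Permission denied (publickey,...)" is the remote sshd rejecting the login, which usually means the Unix account running the Oozie server has no passwordless public-key access to the target user@host that the SSH action connects to. A hedged sketch of the usual setup, run as the Oozie service user on the Oozie server host (the user and address are placeholders taken from the log):

# Create a key pair for the oozie service user if none exists yet
ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
# Install the public key for the account the SSH action logs in as
ssh-copy-id USER@1.2.3.4
# Confirm that a non-interactive login now works
ssh -o PasswordAuthentication=no USER@1.2.3.4 'echo ok'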
I am trying to aggregate some data in an Oozie workflow, but the aggregation step fails.

I found two points of interest in the logs. The first is an error(?) that seems to recur:
After a container completes, it gets killed, but it exits with the non-zero exit code 143.

It finishes:
2015-05-04 15:35:12,013 INFO [IPC Server handler 7 on 49697] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1430730089455_0009_m_000048_0 is : 0.7231312
2015-05-04 15:35:12,015 INFO [IPC Server handler 19 on 49697] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1430730089455_0009_m_000048_0 is : 1.0
and then, when it gets killed by the ApplicationMaster:
2015-05-04 15:35:13,831 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1430730089455_0009_m_000048_0: Container killed by the ApplicationMaster.
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
The second point of interest is the actual error that crashes the job completely; it happens during the reduce phase, and I am not sure whether the two are related:
2015-05-04 15:35:28,767 INFO [IPC Server handler 20 …
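Some context on exit code 143: it is 128 + 15, i.e. the process was ended by SIGTERM rather than crashing on its own. When "Container killed by the ApplicationMaster" appears right after an attempt reports progress 1.0, one common and benign cause is speculative execution, where the AM terminates the slower duplicate attempt once one copy of the task succeeds. A hedged way to test that theory, assuming the job driver uses ToolRunner so -D overrides are honored (the property names are standard MRv2, but verify them for your Hadoop version; the jar name is a placeholder):

# Turn off speculative attempts for one run and see if the 143 kills vanish
hadoop jar my-aggregation-job.jar \
    -Dmapreduce.map.speculative=false \
    -Dmapreduce.reduce.speculative=false \
    <input> <output>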
"实施高级作业控制框架,以帮助链接多个Map-Reduce作业,即调查/改进现有的org.apache.hadoop.mapred.jobcontrol包."
该项目在http://wiki.apache.org/hadoop/ProjectSuggestions#research_projects上随机创意下的项目建议页面上列出
我的困惑是,我是否必须构建Oozie的高级版本(我认为这是一个链接多个工作的工作控制框架)或类似的东西,或者这意味着完全不同的东西.
我错过了什么?
We are running a workflow in Oozie. It contains two actions: the first is a map-reduce job that generates files in HDFS, and the second is a job that should copy the data from those files into a database.

Both parts complete successfully, but Oozie throws an exception at the end and marks the run as a failed process.

Here is the exception:
2014-05-20 17:29:32,242 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:lpinsight (auth:SIMPLE) cause:java.io.IOException: Filesystem closed
2014-05-20 17:29:32,243 WARN org.apache.hadoop.mapred.Child: Error running child
java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:565)
at org.apache.hadoop.hdfs.DFSInputStream.close(DFSInputStream.java:589)
at java.io.FilterInputStream.close(FilterInputStream.java:155)
at org.apache.hadoop.util.LineReader.close(LineReader.java:149)
at org.apache.hadoop.mapred.LineRecordReader.close(LineRecordReader.java:243)
at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.close(MapTask.java:222)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:421)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
2014-05-20 17:29:32,256 INFO org.apache.hadoop.mapred.Task:Runnning cleanup for the task
Any ideas?
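For the "Filesystem closed" error above: a frequently cited cause is Hadoop's FileSystem cache, where FileSystem.get() returns one shared instance per (scheme, authority, user), so if any task or library calls close() on it, every other user of that instance then fails with java.io.IOException: Filesystem closed. One commonly suggested workaround is to give each caller a private instance by disabling the cache for the job; a hedged sketch for the action's <configuration> block (the property name is standard Hadoop, but verify it against your distribution):

<property>
    <name>fs.hdfs.impl.disable.cache</name>
    <value>true</value>
</property>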
I installed Oozie 4.1.0 on a Linux machine following the steps at http://gauravkohli.com/2014/08/26/apache-oozie-installation-on-hadoop-2-4-1/
hadoop version - 2.6.0
maven - 3.0.4
pig - 0.12.0
Cluster setup -
MASTER NODE running - Namenode, Resourcemanager, proxyserver.
SLAVE NODE running - Datanode, Nodemanager.
When I run a single workflow job, it succeeds. But when I try to run multiple workflow jobs, both jobs get stuck in the ACCEPTED state.

Checking the error logs, I drilled down into the problem:
014-12-24 21:00:36,758 [JobControl] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: 172.16.***.***/172.16.***.***:8032. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2014-12-25 09:30:39,145 [communication thread] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: 172.16.***.***/172.16.***.***:52406. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2014-12-25 09:30:39,199 [communication thread] INFO org.apache.hadoop.mapred.Task - Communication exception: …
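A note for anyone debugging workflows stuck in ACCEPTED: every Oozie action first spawns a small launcher MapReduce job, so two concurrent workflows need room for the launcher ApplicationMasters plus the real jobs. On a one-slave cluster, the capacity scheduler's default cap on AM resources can let the launchers block each other indefinitely. A hedged sketch of the usual check and the tuning knob involved (standard YARN names, but verify them against your Hadoop 2.6.0 configuration):

# See what is holding the queue while jobs wait in ACCEPTED
yarn application -list -appStates RUNNING,ACCEPTED
# If AMs are the bottleneck, consider raising, in capacity-scheduler.xml,
# yarn.scheduler.capacity.maximum-am-resource-percent (default 0.1)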
I am using the capture-output option with my Java action, and I use the captured values in downstream actions, which works fine. When I execute the Oozie job, the framework also picks up the values without running the Java action again.

I would like to know where these values are stored.
Thanks in advance.
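For context on where capture-output data lives: the captured key/value pairs become part of the completed action's record, which Oozie persists in its own database along with the rest of the workflow state (not in HDFS), and downstream nodes read them through the wf:actionData EL function. A hedged sketch of the consuming side, where java-node and myKey are hypothetical names:

<!-- java-node/myKey are placeholders for the producing action and key -->
<property>
    <name>some.downstream.property</name>
    <value>${wf:actionData('java-node')['myKey']}</value>
</property>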
I wrote an Oozie workflow that runs a BASH shell script to perform some Hive queries and act on the results. The script runs, but it throws a permission error when accessing some HDFS data. The user who submitted the Oozie workflow has the permissions, yet the script runs as the yarn user.

Is it possible to have Oozie execute the script as the user who submitted the workflow? The Hive and Java actions both execute as the submitting user; only the shell action behaves differently.

Here is a rough outline of my Oozie action:
<action name="start_action"
retry-max="12"
retry-interval="600">
<shell xmlns="uri:oozie:shell-action:0.1">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<job-xml>${WorkflowRoot}/hive-site.xml</job-xml>
<exec>script.sh</exec>
<file>${WorkflowRoot}/script.sh</file>
<capture-output />
</shell>
<ok to="next_action"/>
<error to="send_email"/>
</action>
I am running Oozie 4.1.0 and HDP 2.1.
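On the question above: without Kerberos, a shell action's script runs as the Unix user of the NodeManager's containers (typically yarn), while Hive and Java actions impersonate the submitter at the HDFS API level, which matches the behavior described. A workaround that is often suggested is exporting HADOOP_USER_NAME inside the action so HDFS calls made by the script act as the submitting user; a hedged sketch (only effective on non-Kerberos clusters):

<shell xmlns="uri:oozie:shell-action:0.1">
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <exec>script.sh</exec>
    <!-- wf:user() resolves to the user who submitted the workflow -->
    <env-var>HADOOP_USER_NAME=${wf:user()}</env-var>
    <file>${WorkflowRoot}/script.sh</file>
    <capture-output />
</shell>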
I am running a Hive query through Oozie using Hue.
I am creating a table via a Hue-Oozie workflow.
My job fails, but when I check in Hive, the table has been created.
The log shows the following error:
16157 [main] INFO org.apache.hadoop.hive.ql.hooks.ATSHook - Created ATS Hook
2015-09-24 11:05:35,801 INFO [main] hooks.ATSHook (ATSHook.java:<init>(84)) - Created ATS Hook
16159 [main] ERROR org.apache.hadoop.hive.ql.Driver - hive.exec.post.hooks Class not found:org.apache.atlas.hive.hook.HiveHook
2015-09-24 11:05:35,803 ERROR [main] ql.Driver (SessionState.java:printError(960)) - hive.exec.post.hooks Class not found:org.apache.atlas.hive.hook.HiveHook
16159 [main] ERROR org.apache.hadoop.hive.ql.Driver - FAILED: Hive Internal Error: java.lang.ClassNotFoundException(org.apache.atlas.hive.hook.HiveHook)
java.lang.ClassNotFoundException: org.apache.atlas.hive.hook.HiveHook
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
Unable to identify the problem....
I am using HDP 2.3.1.
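For this ClassNotFoundException: the hive-site.xml the action picks up sets hive.exec.post.hooks to the Atlas hook (org.apache.atlas.hive.hook.HiveHook), but the Atlas hook jars are not on the Oozie action's classpath, so the query itself completes (hence the table exists) and the job then fails in the post-execution hook. Two commonly suggested fixes are shipping the Atlas hook jars in the workflow's lib directory, or clearing the hook for this one action; a hedged sketch of the latter, placed in the Hive action's <configuration>:

<!-- Override the hook inherited from hive-site.xml for this action only -->
<property>
    <name>hive.exec.post.hooks</name>
    <value></value>
</property>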