Oozie is the task-scheduling framework among the four major collaboration frameworks of the Hadoop big data ecosystem; the other three are the data transfer tool Sqoop, the log/file collection framework Flume, and the big data web UI tool Hue.
1. Installation and configuration
Configuring email notifications:
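Email notifications (used by the email action in section 3) require pointing Oozie at an SMTP server in oozie-site.xml and restarting the server. A minimal sketch, assuming an unauthenticated relay; the host and from-address are placeholders:
<property>
    <name>oozie.email.smtp.host</name>
    <value>smtp.example.com</value>
</property>
<property>
    <name>oozie.email.smtp.port</name>
    <value>25</value>
</property>
<property>
    <name>oozie.email.from.address</name>
    <value>oozie@example.com</value>
</property>
<property>
    <name>oozie.email.smtp.auth</name>
    <value>false</value>
</property>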
2. Usage and common commands
2.1 Verifying correctness
- Validate workflow.xml:
oozie validate /appcom/apps/hduser0401/mbl_webtrends/workflow.xml
- Check the Oozie server status:
oozie admin -oozie http://localhost:11000/oozie -status
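All of the examples below pass the server URL with -oozie each time; the CLI also reads the OOZIE_URL environment variable, so exporting it once lets you drop the flag:
export OOZIE_URL=http://localhost:11000/oozie
oozie admin -status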
2.2 Preparing, submitting, starting, and running jobs
- Submit a job (it enters the PREP state): submit
oozie job -oozie http://localhost:11000/oozie -config job.properties -submit
The command prints: job: jobID
- Start a submitted job: start
oozie job -oozie http://localhost:11000/oozie -start jobID
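The two steps can be chained in a script by parsing the job: jobID line that -submit prints. A minimal sketch, assuming the same server URL:
# Submit and capture the ID from the "job: <jobID>" output line
jobid=$(oozie job -oozie http://localhost:11000/oozie -config job.properties -submit | awk '{print $2}')
# Start the job, moving it from PREP to RUNNING
oozie job -oozie http://localhost:11000/oozie -start "$jobid"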
- Run a job directly: run (the most commonly used mode)
oozie job -oozie http://localhost:11000/oozie -config job.properties -run
- Suspend and resume a job:
Suspend: oozie job -suspend 0000004-180119141609585-oozie-hado-C
Resume: oozie job -resume 0000004-180119141609585-oozie-hado-C
- Kill a job:
oozie job -oozie http://localhost:11000/oozie -kill 14-20090525161321-oozie-joe
- Change job parameters (a job in the KILLED state cannot be changed); note that the semicolons must be escaped in the shell:
oozie job -oozie http://localhost:11000/oozie -change 14-20090525161321-oozie-joe -value endtime=2011-12-01T05:00Z\;concurrency=100\;pausetime=2011-10-01T05:00Z
- Rerun a job:
oozie job -rerun 0006360-160701092146726-oozie-hado-C -refresh -action 477-479
oozie job -rerun 0006360-160701092146726-oozie-hado-C -D oozie.wf.rerun.failnodes=false
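For coordinator jobs, the actions to rerun can also be selected by nominal time instead of action number. A sketch, with a placeholder date range:
oozie job -oozie http://localhost:11000/oozie -rerun 0006360-160701092146726-oozie-hado-C -refresh -date 2016-07-01T00:00+0800::2016-07-02T00:00+0800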
- Check job status:
oozie job -oozie http://localhost:11000/oozie -info 14-20090525161321-oozie-joe
- View job logs:
oozie job -oozie http://localhost:11000/oozie -log 14-20090525161321-oozie-joe
- Submit a Pig job:
oozie pig -oozie http://localhost:11000/oozie -file pigScriptFile -config job.properties -X -param_file params
- Submit a MapReduce job:
oozie mapreduce -oozie http://localhost:11000/oozie -config job.properties
- List the share library:
oozie admin -shareliblist sqoop
- Update the share library dynamically:
oozie admin -sharelibupdate
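A sharelib update is typically needed after adding new jars to the sharelib directory on HDFS. A sketch, assuming the default layout under /user/oozie/share/lib and a local spark2-jars directory holding the new jars (both are placeholders):
# Locate the current lib_<timestamp> directory and upload the new jars into its spark2 subdirectory
sharelib=$(hadoop fs -ls /user/oozie/share/lib | awk '/lib_/{print $NF}' | tail -1)
hadoop fs -put spark2-jars/*.jar "$sharelib/spark2/"
oozie admin -oozie http://localhost:11000/oozie -sharelibupdate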
3. Workflow syntax
<workflow-app xmlns="uri:oozie:workflow:0.2" name="Test">
    <start to="GoodsNameGroup"/>
    <action name="GoodsNameGroup">
        <spark xmlns="uri:oozie:spark-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <job-xml>${workflowAppUri}/hive-site.xml</job-xml>
            <configuration>
                <property>
                    <name>oozie.action.sharelib.for.spark</name>
                    <value>spark2</value>
                </property>
                <property>
                    <name>oozie.use.system.libpath</name>
                    <value>true</value>
                </property>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
                <property>
                    <name>tez.queue.name</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
            <master>yarn-cluster</master>
            <mode>cluster</mode>
            <name>GoodsNameGroup</name>
            <class>com.wangdian.spark.tasks.main.clear.test.GoodsNameGroup</class>
            <jar>${jarPath}</jar>
            <spark-opts>--executor-cores 1 --executor-memory 3584M --num-executors 20 --queue ${queueName}</spark-opts>
        </spark>
        <ok to="end"/>
        <error to="an-email"/>
    </action>
    <action name="an-email">
        <email xmlns="uri:oozie:email-action:0.1">
            <to>ludengke@wangdian.cn</to>
            <subject>Email notifications for ${wf:id()} spark-test</subject>
            <body>The workflow ${wf:id()} failed.</body>
        </email>
        <ok to="kill"/>
        <error to="kill"/>
    </action>
    <kill name="kill">
        <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
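To deploy and launch the workflow above, upload the application directory to HDFS and run it with a matching job.properties. A sketch reusing the paths from the submission script in section 5; the Test directory name is a placeholder matching the workflow-app name:
hadoop fs -put Test /user/hdfs/oozie-apps/
oozie job -oozie http://hadoop03:11000/oozie -config Test/job.properties -run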
4. Coordinator syntax:
4.1 coordinator.xml
<coordinator-app name="clear-stat-coordinator" frequency="*/10 0-17,20-23 * * *" start="${start}" end="${end}"
                 timezone="Asia/Shanghai" xmlns="uri:oozie:coordinator:0.2">
    <action>
        <workflow>
            <app-path>${workflowAppUri}</app-path>
            <configuration>
                <property>
                    <name>jobTracker</name>
                    <value>${jobTracker}</value>
                </property>
                <property>
                    <name>nameNode</name>
                    <value>${nameNode}</value>
                </property>
                <property>
                    <name>queueName</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
        </workflow>
    </action>
</coordinator-app>
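The frequency attribute above is a cron-like expression: trigger every 10 minutes during hours 0-17 and 20-23, pausing between 18:00 and 20:00. For simple fixed intervals Oozie also provides EL functions; a sketch of a once-a-day header (the name is a placeholder):
<coordinator-app name="daily-coordinator" frequency="${coord:days(1)}" start="${start}" end="${end}"
                 timezone="Asia/Shanghai" xmlns="uri:oozie:coordinator:0.2">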
4.2 job.properties
nameNode=hdfs://cluster
jobTracker=master
examplesRoot=oozie-apps
queueName=task
# oozie.wf.application.path identifies this as a workflow job; oozie.coord.application.path identifies a coordinator job.
# oozie.wf.application.path=${nameNode}/user/${user.name}/${examplesRoot}/clear-stat-coordinator/workflow.xml
oozie.coord.application.path=${nameNode}/user/${user.name}/${examplesRoot}/clear-stat-coordinator/coordinator.xml
sparkopts=--executor-cores 1 --executor-memory 1g --num-executors 8 --queue ${queueName} --conf spark.yarn.maxAppAttempts=1
workflowAppUri=${nameNode}/user/${user.name}/${examplesRoot}/clear-stat-coordinator
jarPath=lib/SparkTasks.jar
statClass=com.wangdian.spark.tasks.main.stat.ExecuteSparkSqlTask
clearClass=com.wangdian.spark.tasks.main.clear.StatDateClearApiTrade
distinctClass=com.wangdian.spark.tasks.main.clear.HiveGoodsDistinctAndCategoryStatistics
start=2020-01-09T15:00+0800
end=2020-01-11T17:45+0800
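With oozie.coord.application.path pointing at coordinator.xml, submitting this properties file starts the coordinator, which then materializes workflow runs on the cron schedule. A sketch of submitting it and checking the materialized actions (the job ID is a placeholder taken from the earlier examples):
oozie job -oozie http://hadoop03:11000/oozie -config job.properties -run
oozie job -oozie http://hadoop03:11000/oozie -info 0000004-180119141609585-oozie-hado-C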
5. Using a script to save time:
Since every submission requires re-uploading the application directory to the cluster, these commands can be automated.
Job submission:
./start.sh <application directory name>
#!/bin/bash
# start.sh <app-dir>: upload an Oozie application directory to HDFS and run it.
# Refuse to run if more than one SparkTasks jar sits in the jar warehouse,
# so that only the single latest build can be deployed.
jarNum=$(ls /data/bin/oozie-apps/spark-jar-warehouse | grep -c 'SparkTasks.*\.jar')
if [ "$jarNum" -gt 1 ]; then
    echo "Multiple SparkTasks.jar-like files exist; please keep only the latest version."
    exit 1
fi
hadoop fs -rm -r /user/hdfs/oozie-apps/"$1"
hadoop fs -put "$1" /user/hdfs/oozie-apps/
oozie job -oozie http://hadoop03:11000/oozie -config "$1"/job.properties -run
6. Common examples illustrated:
6.1 At execution start:
6.2 After execution completes:
7. A fix for Oozie jobs stuck in PREP for a long time
7.1 The Oozie PREP state
7.2 The cause turned out to be too many applications accumulated in YARN, which led to two problems: 1. the web UI became slow; 2. Oozie job submission became slow.
7.3 Solution:
- Delete the logs of the 10000+ old applications, freeing roughly 800 GB of space (the location is whatever HDFS log directory your cluster is configured with).
- Delete the application state stored in ZooKeeper (default path; see the sketch after this list):
  in ZooKeeper: rmr /rmstore/ZKRMStateRoot/RMAppRoot
- Restart Oozie.
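A sketch of the ZooKeeper cleanup from the zkCli.sh shell (the ZooKeeper host and port are placeholders; stop the ResourceManager first so the state is not immediately rewritten):
zkCli.sh -server zk01:2181
rmr /rmstore/ZKRMStateRoot/RMAppRoot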
- Before the cleanup:
- After the cleanup:
With that, the problem was solved!