HOW TO USE OOZIE

Basic usage of the big-data scheduling framework Oozie, with illustrations

Posted by LSG on August 25, 2019

Oozie is the task-scheduling member of the four major collaboration frameworks in the big-data ecosystem; the other three are the data transfer tool Sqoop, the file collection framework Flume, and the big-data web UI Hue.

1. Installation and configuration

Email configuration: (screenshot)
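Beyond the screenshot, the email action used later in this article needs SMTP settings in oozie-site.xml. A minimal sketch, with placeholder host and sender address (restart Oozie after editing):

<!-- oozie-site.xml: SMTP settings for the email action (host/port/sender are placeholders) -->
<property>
    <name>oozie.email.smtp.host</name>
    <value>smtp.example.com</value>
</property>
<property>
    <name>oozie.email.smtp.port</name>
    <value>25</value>
</property>
<property>
    <name>oozie.email.from.address</name>
    <value>oozie@example.com</value>
</property>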

2. Usage and common commands

2.1 Verifying correctness

  1. Validate workflow.xml:
    oozie validate /appcom/apps/hduser0401/mbl_webtrends/workflow.xml

  2. Check the Oozie server status: oozie admin -oozie http://localhost:11000/oozie -status

2.2 Submitting, starting, and running jobs

  1. Submit a job (it enters the PREP state): oozie job -oozie http://localhost:11000/oozie -config job.properties -submit. The command prints: job: jobID

  2. Start a submitted job: oozie job -oozie http://localhost:11000/oozie -start jobID

  3. Run a job directly (the most common way): oozie job -oozie http://localhost:11000/oozie -config job.properties -run

  4. Suspend and resume a job:
    suspend: oozie job -suspend 0000004-180119141609585-oozie-hado-C
    resume: oozie job -resume 0000004-180119141609585-oozie-hado-C

  5. Kill a job: oozie job -oozie http://localhost:11000/oozie -kill 14-20090525161321-oozie-joe

  6. Change job parameters (a job in KILLED state cannot be modified): oozie job -oozie http://localhost:11000/oozie -change 14-20090525161321-oozie-joe -value endtime=2011-12-01T05:00Z;concurrency=100;pausetime=2011-10-01T05:00Z

  7. Rerun a job:
    rerun specific coordinator actions: oozie job -rerun 0006360-160701092146726-oozie-hado-C -refresh -action 477-479
    rerun all nodes, not only the failed ones: oozie job -rerun 0006360-160701092146726-oozie-hado-C -D oozie.wf.rerun.failnodes=false

  8. Check job status:
    oozie job -oozie http://localhost:11000/oozie -info 14-20090525161321-oozie-joe

  9. View job logs:
    oozie job -oozie http://localhost:11000/oozie -log 14-20090525161321-oozie-joe

  10. Submit a Pig job:
    oozie pig -oozie http://localhost:11000/oozie -file pigScriptFile -config job.properties -X -param_file params

  11. Submit a MapReduce job: oozie mapreduce -oozie http://localhost:11000/oozie -config job.properties

  12. List shared libraries: oozie admin -shareliblist sqoop

  13. Refresh the shared libraries dynamically: oozie admin -sharelibupdate
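One convenience the examples above do not show: the Oozie CLI reads the server URL from the OOZIE_URL environment variable, so -oozie can be dropped from every command. A typical submit-and-check session might look like this (the job ID shown is illustrative):

export OOZIE_URL=http://localhost:11000/oozie
oozie validate /appcom/apps/hduser0401/mbl_webtrends/workflow.xml
oozie job -config job.properties -run    # prints: job: 0000004-180119141609585-oozie-hado-C
oozie job -info 0000004-180119141609585-oozie-hado-C
oozie job -log 0000004-180119141609585-oozie-hado-C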

3. Workflow syntax

<workflow-app xmlns="uri:oozie:workflow:0.2" name="Test">
    <start to="GoodsNameGroup"/>
    <action name="GoodsNameGroup">
        <spark xmlns="uri:oozie:spark-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <job-xml>${workflowAppUri}/hive-site.xml</job-xml>
            <configuration>
                <property>
                    <name>oozie.action.sharelib.for.spark</name>
                    <value>spark2</value>
                </property>
                <property>
                    <name>oozie.use.system.libpath</name>
                    <value>true</value>
                </property>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
                <property>
                    <name>tez.queue.name</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
            <master>yarn-cluster</master>
            <mode>cluster</mode>
            <name>GoodsNameGroup</name>
            <class>com.wangdian.spark.tasks.main.clear.test.GoodsNameGroup</class>
            <jar>${jarPath}</jar>
            <spark-opts>--executor-cores 1 --executor-memory 3584M --num-executors 20 --queue ${queueName}</spark-opts>
        </spark>
        <ok to="end"/>
        <error to="an-email"/>
    </action>
    <action name="an-email">
        <email xmlns="uri:oozie:email-action:0.1">
            <to>ludengke@wangdian.cn</to>
            <subject>Email notifications for ${wf:id()} spark-test</subject>
            <body>The workflow ${wf:id()} failed.</body>
        </email>
        <ok to="kill"/>
        <error to="kill"/>
    </action>
    <kill name="kill">
        <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
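For ${workflowAppUri}/hive-site.xml and ${jarPath} above to resolve, the application directory on HDFS needs a layout roughly like the one below (inferred from the job-xml and jarPath settings in this article); Oozie automatically adds jars under lib/ to the action classpath:

clear-stat-coordinator/
├── coordinator.xml
├── workflow.xml
├── hive-site.xml      # referenced via <job-xml>
└── lib/
    └── SparkTasks.jar # referenced via jarPath=lib/SparkTasks.jar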

4. Coordinator syntax:

4.1 coordinator.xml

<coordinator-app name="clear-stat-coordinator" frequency="*/10 0-17,20-23 * * *" start="${start}" end="${end}" timezone="Asia/Shanghai"
                 xmlns="uri:oozie:coordinator:0.2">
    <action>
        <workflow>
            <app-path>${workflowAppUri}</app-path>
            <configuration>
                <property>
                    <name>jobTracker</name>
                    <value>${jobTracker}</value>
                </property>
                <property>
                    <name>nameNode</name>
                    <value>${nameNode}</value>
                </property>
                <property>
                    <name>queueName</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
        </workflow>
    </action>
</coordinator-app>
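The cron-style frequency */10 0-17,20-23 * * * materializes one workflow run every 10 minutes during hours 00-17 and 20-23 Asia/Shanghai time; in other words, scheduling pauses between 18:00 and 20:00. Because job.properties (next section) sets oozie.coord.application.path, the coordinator is submitted with the same -run command from section 2.2:

oozie job -oozie http://localhost:11000/oozie -config job.properties -run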

4.2 job.properties

nameNode=hdfs://cluster
jobTracker=master
examplesRoot=oozie-apps
queueName=task

# oozie.wf.application.path identifies this as a workflow job
# oozie.wf.application.path=${nameNode}/user/${user.name}/${examplesRoot}/clear-stat-coordinator/workflow.xml
oozie.coord.application.path=${nameNode}/user/${user.name}/${examplesRoot}/clear-stat-coordinator/coordinator.xml
sparkopts=--executor-cores 1 --executor-memory 1g --num-executors 8 --queue ${queueName} --conf spark.yarn.maxAppAttempts=1
workflowAppUri=${nameNode}/user/${user.name}/${examplesRoot}/clear-stat-coordinator
jarPath=lib/SparkTasks.jar
statClass=com.wangdian.spark.tasks.main.stat.ExecuteSparkSqlTask
clearClass=com.wangdian.spark.tasks.main.clear.StatDateClearApiTrade
distinctClass=com.wangdian.spark.tasks.main.clear.HiveGoodsDistinctAndCategoryStatistics
start=2020-01-09T15:00+0800
end=2020-01-11T17:45+0800
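Note that sparkopts, statClass, clearClass, and distinctClass are not used by the Test workflow in section 3, which hardcodes its class, jar, and Spark options; presumably the production workflow.xml (not shown in this article) consumes them via EL, along these lines:

<class>${statClass}</class>
<jar>${jarPath}</jar>
<spark-opts>${sparkopts}</spark-opts>

Also note that start and end carry an explicit +0800 offset, matching the coordinator's Asia/Shanghai timezone.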

5. Using a script to save time:

Since every submission requires uploading the application to the cluster first, these commands are worth automating.

Job submission: ./start.sh <app directory name>

#!/bin/bash
# Make sure exactly one SparkTasks.jar variant exists in the jar warehouse
jarNum=$(ls /data/bin/oozie-apps/spark-jar-warehouse | grep -c "SparkTasks.jar")
if [ "$jarNum" -gt 1 ]; then
    echo "Multiple SparkTasks.jar-like files exist; please keep only the latest version"
    exit 1
fi
# Replace the app directory on HDFS with the local copy, then run the job
hadoop fs -rm -r /user/hdfs/oozie-apps/$1
hadoop fs -put $1 /user/hdfs/oozie-apps/
oozie job -oozie http://hadoop03:11000/oozie -config $1/job.properties -run
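Usage, assuming the app directory from section 4 sits next to the script:

./start.sh clear-stat-coordinator

The hadoop fs -rm -r before the put ensures the previous copy of the app is fully replaced, so Oozie never picks up stale XML.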

6. Common examples, illustrated:

6.1 Execution started:

(screenshot)

6.2 Execution finished:

(screenshot)

7. Oozie jobs stuck in PREP for a long time: one optimization

7.1 Oozie PREP state

(screenshot)

7.2 The cause turned out to be too many applications in YARN, leading to two problems: 1. the web UI loaded slowly; 2. Oozie submitted jobs slowly

(screenshot)

7.3 Solution:

  1. Deleted the logs of 10,000+ old applications, freeing roughly 800 GB (at whatever HDFS location the cluster is configured to use)
  2. Deleted the application entries in ZooKeeper (default location: rmr /rmstore/ZKRMStateRoot/RMAppRoot in the ZooKeeper shell)
  3. Restarted Oozie (a command-level sketch of all three steps follows)
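As shell commands, the three steps might look like the sketch below. The aggregated-log path is an assumption: it depends on yarn.nodemanager.remote-app-log-dir, for which /tmp/logs is a common default. The restart commands assume a standalone Oozie installation.

# 1. Free the HDFS space held by old aggregated application logs
#    (path depends on yarn.nodemanager.remote-app-log-dir; /tmp/logs is a common default)
hadoop fs -rm -r "/tmp/logs/*/logs/application_*"
# 2. Clear the ResourceManager state store in ZooKeeper (old zkCli syntax; newer releases use deleteall)
zkCli.sh -server localhost:2181 rmr /rmstore/ZKRMStateRoot/RMAppRoot
# 3. Restart Oozie
oozied.sh stop
oozied.sh start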
  • Before deletion:

(screenshot)

  • After deletion:

(screenshot)

Problem solved!
