Using Spark2 action nodes in Oozie on the HDP platform


On HDP, Oozie uses Spark1 by default, not Spark2, so Spark2 support has to be prepared by hand. See bk_spark-component-guide.pdf, p. 49, for reference.

The logs are also somewhat troublesome to read; to see what was actually executed, you may need to check the History Server logs.
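For the actual executor output, the YARN aggregated logs are usually the most direct route (a generic example; substitute the application ID reported by the Oozie launcher, e.g. the one from section 3.3 below):

    yarn logs -applicationId application_1530177734281_1853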

1. Preparing the libs

Just follow the documentation steps below. In my testing, if you do not need to use Hive, you should be able to skip hive-site.xml.

1. Create a spark2 ShareLib directory under the Oozie ShareLib directory associated with the oozie service user:
hdfs dfs -mkdir /user/oozie/share/lib/lib_<ts>/spark2
2. Copy spark2 jar files from the spark2 jar directory to the Oozie spark2 ShareLib:
hdfs dfs -put /usr/hdp/current/spark2/jars/* /user/oozie/share/lib/lib_<ts>/spark2/
3. Copy the oozie-sharelib-spark jar file from the spark ShareLib directory to the spark2 ShareLib directory:
hdfs dfs -cp /user/oozie/share/lib/lib_<ts>/spark/oozie-sharelib-spark-4.2.0.2.6.4.0-91.jar /user/oozie/share/lib/lib_<ts>/spark2/
4. Copy the hive-site.xml file from the current spark ShareLib to the spark2 ShareLib directory:
hdfs dfs -cp /user/oozie/share/lib/lib_<ts>/spark/hive-site.xml /user/oozie/share/lib/lib_<ts>/spark2/
5. Copy Python libraries to the spark2 ShareLib:
hdfs dfs -put /usr/hdp/current/spark2-client/python/lib/py* /user/oozie/share/lib/lib_<ts>/spark2/
6. Run the Oozie sharelibupdate command:
oozie admin -sharelibupdate
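
To confirm that Oozie now sees the new spark2 directory, you can list the ShareLib (a quick sanity check; the exact output depends on the Oozie version):

    oozie admin -shareliblist spark2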

2. Configuring the job node and workflow parameters

Add a spark action node as usual, but since our libs live under spark2, you must specify oozie.action.sharelib.for.spark=spark2 when submitting the workflow; see the sketch below.
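
Here is a minimal sketch of the two pieces, assuming a standard HDP layout; the action name spark_1 and the script wordcount_noparm.py are taken from the logs later in this post, while the jobTracker value is a placeholder for your ResourceManager address. In job.properties:

    nameNode=hdfs://bigdatamster:8020
    jobTracker=<resourcemanager_host>:8050
    oozie.use.system.libpath=true
    oozie.action.sharelib.for.spark=spark2
    oozie.wf.application.path=${nameNode}/user/oozie/workflow/wkf_example

And the spark action in workflow.xml (for PySpark, the <jar> element points at the .py file):

    <action name="spark_1">
        <spark xmlns="uri:oozie:spark-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <master>yarn-cluster</master>
            <name>wordcount</name>
            <jar>${nameNode}/user/oozie/workflow/wkf_example/wordcount_noparm.py</jar>
        </spark>
        <ok to="end"/>
        <error to="fail"/>
    </action>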

3. Problems encountered in use

3.1 The executable file (e.g. a Python file) does not use an HDFS path

java.io.FileNotFoundException: File file:/user/oozie/workflow/wkf_example/wordcount_noparm.py does not exist
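
Note the file:/ scheme in the error: Oozie looked for the script on the local filesystem. The fix is to reference the script by its full HDFS URI in the workflow's <jar> element (same path as in the error, but on HDFS):

    <jar>hdfs://bigdatamster:8020/user/oozie/workflow/wkf_example/wordcount_noparm.py</jar>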

3.2 Some spark2 libs conflict with other ShareLib directories

Error message
2018-07-03 14:52:51,542  WARN SparkActionExecutor:523 - SERVER[bigdatanode0] USER[oozie] GROUP[-] TOKEN[] APP[wkf_test21] JOB[0000054-180702162526364-oozie-oozi-W] ACTION[0000054-180702162526364-oozie-oozi-W@spark_1] Launcher ERROR, reason: Main class [org.apache.oozie.action.hadoop.SparkMain], main() threw exception, Attempt to add (hdfs://bigdatamster:8020/user/oozie/share/lib/lib_20180628161326/oozie/aws-java-sdk-core-1.10.6.jar) multiple times to the distributed cache.
2018-07-03 14:52:51,543  WARN SparkActionExecutor:523 - SERVER[bigdatanode0] USER[oozie] GROUP[-] TOKEN[] APP[wkf_test21] JOB[0000054-180702162526364-oozie-oozi-W] ACTION[0000054-180702162526364-oozie-oozi-W@spark_1] Launcher exception: Attempt to add (hdfs://bigdatamster:8020/user/oozie/share/lib/lib_20180628161326/oozie/aws-java-sdk-core-1.10.6.jar) multiple times to the distributed cache.
java.lang.IllegalArgumentException: Attempt to add (hdfs://bigdatamster:8020/user/oozie/share/lib/lib_20180628161326/oozie/aws-java-sdk-core-1.10.6.jar) multiple times to the distributed cache.
    at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$13$$anonfun$apply$6.apply(Client.scala:619)
    at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$13$$anonfun$apply$6.apply(Client.scala:610)
Commands to resolve the conflict:
    hadoop fs -mv /user/oozie/share/lib/lib_20180628161326/spark2/aws*  /user/oozie/share/lib/lib_20180628161326/spark2_old/
    hadoop fs -mv /user/oozie/share/lib/lib_20180628161326/spark2/azure*  /user/oozie/share/lib/lib_20180628161326/spark2_old/
    hadoop fs -mv /user/oozie/share/lib/lib_20180628161326/spark2/hadoop-aws*  /user/oozie/share/lib/lib_20180628161326/spark2_old/
    hadoop fs -mv /user/oozie/share/lib/lib_20180628161326/spark2/hadoop-azure*  /user/oozie/share/lib/lib_20180628161326/spark2_old/
    hadoop fs -mv /user/oozie/share/lib/lib_20180628161326/spark2/ok*  /user/oozie/share/lib/lib_20180628161326/spark2_old/
    hadoop fs -mv /user/oozie/share/lib/lib_20180628161326/oozie/jackson* /user/oozie/share/lib/lib_20180628161326/oozie_old/
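Two caveats about the commands above (assumptions about your setup): the spark2_old and oozie_old target directories should exist beforehand, otherwise hadoop fs -mv may fail or rename instead of move, and Oozie likely needs to be told that the ShareLib changed. Assuming the same lib_<ts> directory:

    hadoop fs -mkdir /user/oozie/share/lib/lib_20180628161326/spark2_old
    hadoop fs -mkdir /user/oozie/share/lib/lib_20180628161326/oozie_old
    oozie admin -sharelibupdate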
    

3.3 Resource allocation failure: Exit code is 143

This is a resource-allocation failure.

Error message
org.apache.spark.SparkException: Application application_1530177734281_1853 finished with failed status
    at org.apache.spark.deploy.yarn.Client.run(Client.scala:1187)
    at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1233)
    at org.apache.spark.deploy.yarn.Client.main(Client.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
Looking at the details under Log Type: syslog:
2018-07-03 15:07:25,568 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1530177734281_1852_m_000000_0: Container killed by the ApplicationMaster.
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143.

Setting spark-submit parameters works around this, but in the long run the YARN parameters should be tuned. The values below worked for me; the sketch after them shows where they go.

--num-executors 1 --driver-memory 512m --executor-memory 512m --executor-cores 1 
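
In the Oozie spark action, these parameters go into the <spark-opts> element (continuing the sketch from section 2):

    <spark-opts>--num-executors 1 --driver-memory 512m --executor-memory 512m --executor-cores 1</spark-opts>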

3.4 oozie.action.sharelib.for.spark not configured

If the workflow is submitted without

oozie.action.sharelib.for.spark=spark2

Oozie falls back to the default spark (Spark1) ShareLib, which does not contain the py4j/pyspark zip files copied in step 5 above, and the action fails.

Error message
org.apache.oozie.action.hadoop.OozieActionConfiguratorException: Missing py4j and/or pyspark zip files. Please add them to the lib folder or to the Spark sharelib.
    at org.apache.oozie.action.hadoop.SparkMain.getMatchingPyFile(SparkMain.java:286)
    at org.apache.oozie.action.hadoop.SparkMain.createPySparkLibFolder(SparkMain.java:267)

3.5 Output directory problem

The test used this statement:

counts.saveAsTextFile("hdfs://bigdatamster:8020/sparktest/admindata/aafolder")

If the directory already exists, this throws an error. You have to check the Spark logs for the details; the YARN logs do not show the specifics. The fix is shown after the error message.

Error message
734281_2164/container_e06_1530177734281_2164_02_000001/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 1133, in __call__
  File "/hadoop/yarn/local/usercache/oozie/appcache/application_1530177734281_2164/container_e06_1530177734281_2164_02_000001/pyspark.zip/pyspark/sql/utils.py", line 63, in deco
  File "/hadoop/yarn/local/usercache/oozie/appcache/application_1530177734281_2164/container_e06_1530177734281_2164_02_000001/py4j-0.10.4-src.zip/py4j/protocol.py", line 319, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o71.saveAsTextFile.
: org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://bigdatamster:8020/sparktest/admindata/aafolder already exists
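
The simplest fix is to remove the output directory before re-running the workflow (path taken from the statement above):

    hadoop fs -rm -r hdfs://bigdatamster:8020/sparktest/admindata/aafolder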
