Using Spark2 action nodes in Oozie on the HDP platform


On HDP, Oozie uses Spark1 by default, not Spark2, so Spark2 support has to be prepared by hand. See bk_spark-component-guide.pdf, p. 49, for reference.

The logs are also somewhat troublesome to read; to see what was actually executed, you may need to check the History Server logs.
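For the actual executor output, the YARN aggregated logs are usually the most direct route (a generic example; substitute the application ID reported by the Oozie launcher, e.g. the one from section 3.3 below):

    yarn logs -applicationId application_1530177734281_1853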

1. Preparing the libs

Just follow the documentation steps below. In my testing, if you do not need to use Hive, you should be able to skip hive-site.xml.

1. Create a spark2 ShareLib directory under the Oozie ShareLib directory associated with the oozie service user:
hdfs dfs -mkdir /user/oozie/share/lib/lib_<ts>/spark2
2. Copy spark2 jar files from the spark2 jar directory to the Oozie spark2 ShareLib:
hdfs dfs -put /usr/hdp/current/spark2/jars/* /user/oozie/share/lib/lib_<ts>/spark2/
3. Copy the oozie-sharelib-spark jar file from the spark ShareLib directory to the spark2 ShareLib directory:
hdfs dfs -cp /user/oozie/share/lib/lib_<ts>/spark/oozie-sharelib-spark-4.2.0.2.6.4.0-91.jar /user/oozie/share/lib/lib_<ts>/spark2/
4. Copy the hive-site.xml file from the current spark ShareLib to the spark2 ShareLib directory:
hdfs dfs -cp /user/oozie/share/lib/lib_<ts>/spark/hive-site.xml /user/oozie/share/lib/lib_<ts>/spark2/
5. Copy Python libraries to the spark2 ShareLib:
hdfs dfs -put /usr/hdp/current/spark2-client/python/lib/py* /user/oozie/share/lib/lib_<ts>/spark2/
6. Run the Oozie sharelibupdate command:
oozie admin -sharelibupdate
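
To confirm that Oozie now sees the new spark2 directory, you can list the ShareLib (a quick sanity check; the exact output depends on the Oozie version):

    oozie admin -shareliblist spark2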

2. Configuring the job node and workflow parameters

Add a spark action node as usual, but since our libs live under spark2, you must specify oozie.action.sharelib.for.spark=spark2 when submitting the workflow; see the sketch below.
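
Here is a minimal sketch of the two pieces, assuming a standard HDP layout; the action name spark_1 and the script wordcount_noparm.py are taken from the logs later in this post, while the jobTracker value is a placeholder for your ResourceManager address. In job.properties:

    nameNode=hdfs://bigdatamster:8020
    jobTracker=<resourcemanager_host>:8050
    oozie.use.system.libpath=true
    oozie.action.sharelib.for.spark=spark2
    oozie.wf.application.path=${nameNode}/user/oozie/workflow/wkf_example

And the spark action in workflow.xml (for PySpark, the <jar> element points at the .py file):

    <action name="spark_1">
        <spark xmlns="uri:oozie:spark-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <master>yarn-cluster</master>
            <name>wordcount</name>
            <jar>${nameNode}/user/oozie/workflow/wkf_example/wordcount_noparm.py</jar>
        </spark>
        <ok to="end"/>
        <error to="fail"/>
    </action>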

3. Problems encountered in use

3.1 The executable file (e.g. a Python file) does not use an HDFS path

java.io.FileNotFoundException: File file:/user/oozie/workflow/wkf_example/wordcount_noparm.py does not exist
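
Note the file:/ scheme in the error: Oozie looked for the script on the local filesystem. The fix is to reference the script by its full HDFS URI in the workflow's <jar> element (same path as in the error, but on HDFS):

    <jar>hdfs://bigdatamster:8020/user/oozie/workflow/wkf_example/wordcount_noparm.py</jar>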

3.2 Some spark2 libs conflict with other ShareLib directories

Error message
2018-07-03 14:52:51,542  WARN SparkActionExecutor:523 - SERVER[bigdatanode0] USER[oozie] GROUP[-] TOKEN[] APP[wkf_test21] JOB[0000054-180702162526364-oozie-oozi-W] ACTION[0000054-180702162526364-oozie-oozi-W@spark_1] Launcher ERROR, reason: Main class [org.apache.oozie.action.hadoop.SparkMain], main() threw exception, Attempt to add (hdfs://bigdatamster:8020/user/oozie/share/lib/lib_20180628161326/oozie/aws-java-sdk-core-1.10.6.jar) multiple times to the distributed cache.
2018-07-03 14:52:51,543  WARN SparkActionExecutor:523 - SERVER[bigdatanode0] USER[oozie] GROUP[-] TOKEN[] APP[wkf_test21] JOB[0000054-180702162526364-oozie-oozi-W] ACTION[0000054-180702162526364-oozie-oozi-W@spark_1] Launcher exception: Attempt to add (hdfs://bigdatamster:8020/user/oozie/share/lib/lib_20180628161326/oozie/aws-java-sdk-core-1.10.6.jar) multiple times to the distributed cache.
java.lang.IllegalArgumentException: Attempt to add (hdfs://bigdatamster:8020/user/oozie/share/lib/lib_20180628161326/oozie/aws-java-sdk-core-1.10.6.jar) multiple times to the distributed cache.
    at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$13$$anonfun$apply$6.apply(Client.scala:619)
    at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$13$$anonfun$apply$6.apply(Client.scala:610)
Commands to resolve the conflict:
    hadoop fs -mv /user/oozie/share/lib/lib_20180628161326/spark2/aws*  /user/oozie/share/lib/lib_20180628161326/spark2_old/
    hadoop fs -mv /user/oozie/share/lib/lib_20180628161326/spark2/azure*  /user/oozie/share/lib/lib_20180628161326/spark2_old/
    hadoop fs -mv /user/oozie/share/lib/lib_20180628161326/spark2/hadoop-aws*  /user/oozie/share/lib/lib_20180628161326/spark2_old/
    hadoop fs -mv /user/oozie/share/lib/lib_20180628161326/spark2/hadoop-azure*  /user/oozie/share/lib/lib_20180628161326/spark2_old/
    hadoop fs -mv /user/oozie/share/lib/lib_20180628161326/spark2/ok*  /user/oozie/share/lib/lib_20180628161326/spark2_old/
    hadoop fs -mv /user/oozie/share/lib/lib_20180628161326/oozie/jackson* /user/oozie/share/lib/lib_20180628161326/oozie_old/
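Two caveats about the commands above (assumptions about your setup): the spark2_old and oozie_old target directories should exist beforehand, otherwise hadoop fs -mv may fail or rename instead of move, and Oozie likely needs to be told that the ShareLib changed. Assuming the same lib_<ts> directory:

    hadoop fs -mkdir /user/oozie/share/lib/lib_20180628161326/spark2_old
    hadoop fs -mkdir /user/oozie/share/lib/lib_20180628161326/oozie_old
    oozie admin -sharelibupdate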
    

3.3 Resource allocation failure: Exit code is 143

This is a resource-allocation failure.

Error message
org.apache.spark.SparkException: Application application_1530177734281_1853 finished with failed status
    at org.apache.spark.deploy.yarn.Client.run(Client.scala:1187)
    at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1233)
    at org.apache.spark.deploy.yarn.Client.main(Client.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
Looking at the details under Log Type: syslog:
2018-07-03 15:07:25,568 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1530177734281_1852_m_000000_0: Container killed by the ApplicationMaster.
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143.

Setting spark-submit parameters works around this, but in the long run the YARN parameters should be tuned. The values below worked for me; the sketch after them shows where they go.

--num-executors 1 --driver-memory 512m --executor-memory 512m --executor-cores 1 
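
In the Oozie spark action, these parameters go into the <spark-opts> element (continuing the sketch from section 2):

    <spark-opts>--num-executors 1 --driver-memory 512m --executor-memory 512m --executor-cores 1</spark-opts>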

3.4 oozie.action.sharelib.for.spark not configured

If the workflow is submitted without

oozie.action.sharelib.for.spark=spark2

Oozie falls back to the default spark (Spark1) ShareLib, which does not contain the py4j/pyspark zip files copied in step 5 above, and the action fails.

Error message
org.apache.oozie.action.hadoop.OozieActionConfiguratorException: Missing py4j and/or pyspark zip files. Please add them to the lib folder or to the Spark sharelib.
    at org.apache.oozie.action.hadoop.SparkMain.getMatchingPyFile(SparkMain.java:286)
    at org.apache.oozie.action.hadoop.SparkMain.createPySparkLibFolder(SparkMain.java:267)

3.5 Output directory problem

The test used this statement:

counts.saveAsTextFile("hdfs://bigdatamster:8020/sparktest/admindata/aafolder")

If the directory already exists, this throws an error. You have to check the Spark logs for the details; the YARN logs do not show the specifics. The fix is shown after the error message.

Error message
734281_2164/container_e06_1530177734281_2164_02_000001/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 1133, in __call__
  File "/hadoop/yarn/local/usercache/oozie/appcache/application_1530177734281_2164/container_e06_1530177734281_2164_02_000001/pyspark.zip/pyspark/sql/utils.py", line 63, in deco
  File "/hadoop/yarn/local/usercache/oozie/appcache/application_1530177734281_2164/container_e06_1530177734281_2164_02_000001/py4j-0.10.4-src.zip/py4j/protocol.py", line 319, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o71.saveAsTextFile.
: org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://bigdatamster:8020/sparktest/admindata/aafolder already exists
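
The simplest fix is to remove the output directory before re-running the workflow (path taken from the statement above):

    hadoop fs -rm -r hdfs://bigdatamster:8020/sparktest/admindata/aafolder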
