This article walks through configuring the Spark History Server Web UI for Spark on YARN. The steps are concise and easy to follow, and hopefully you will find them useful.
1. Enter the Spark conf directory and copy the configuration template
[root@sht-sgmhadoopnn-01 ~]# cd /root/learnproject/app/spark/conf
[root@sht-sgmhadoopnn-01 conf]# cp spark-defaults.conf.template spark-defaults.conf
2. Create the spark-history log storage path on HDFS (it can also be on the local Linux filesystem)
[root@sht-sgmhadoopnn-01 conf]# hdfs dfs -ls /
Found 3 items
drwxr-xr-x - root root 0 2017-02-14 12:43 /spark
drwxrwx--- - root root 0 2017-02-14 12:58 /tmp
drwxr-xr-x - root root 0 2017-02-14 12:58 /user
[root@sht-sgmhadoopnn-01 conf]# hdfs dfs -ls /spark
Found 1 items
drwxrwxrwx - root root 0 2017-02-15 21:44 /spark/checkpointdata
[root@sht-sgmhadoopnn-01 conf]# hdfs dfs -mkdir /spark/historylog
# Create a directory in HDFS to store Spark event logs; the Spark History Server reads log information from this directory.
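As noted above, the log directory can also live on the local filesystem (a `file://` path) for a single-node test setup; like the HDFS path, it must exist before the history server starts. A minimal sketch with a throwaway temp directory (the path here is illustrative, not the one used on the cluster):

```shell
# Create a scratch local directory to stand in for the event-log path
localdir=$(mktemp -d)/spark-historylog
mkdir -p "$localdir"
# This is the form of value you would put in spark.eventLog.dir for a
# local-filesystem setup
echo "file://$localdir"
# Verify the directory exists before the history server would start
[ -d "$localdir" ] && ok=yes || ok=no
echo "$ok"   # yes
rm -rf "$(dirname "$localdir")"
```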
3. Configure spark-defaults.conf
[root@sht-sgmhadoopnn-01 conf]# vi spark-defaults.conf
spark.eventLog.enabled true
spark.eventLog.compress true
spark.eventLog.dir hdfs://mycluster/spark/historylog
spark.yarn.historyServer.address 172.16.101.55:18080
# spark.eventLog.dir: the path where event logs are saved; it may be an hdfs:// HDFS path or a file:// local path, and either way it must be created in advance.
# spark.yarn.historyServer.address: the address of the Spark History Server (without http://).
This address is handed to the YARN ResourceManager when a Spark application finishes, so that clicking the application's link on the RM UI jumps to the History Server UI.
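Note that `spark-defaults.conf` uses a simple whitespace-separated "key value" format, not `key=value` as in Java properties files. A small sketch that recreates the four settings above in a scratch copy and reads one back (the file path is a temp file created just for illustration):

```shell
# Write the four settings into a scratch copy of spark-defaults.conf
conf=$(mktemp)
cat > "$conf" <<'EOF'
spark.eventLog.enabled           true
spark.eventLog.compress          true
spark.eventLog.dir               hdfs://mycluster/spark/historylog
spark.yarn.historyServer.address 172.16.101.55:18080
EOF
# awk: print the value (field 2) for a given key (field 1)
dir=$(awk '$1 == "spark.eventLog.dir" {print $2}' "$conf")
echo "$dir"   # hdfs://mycluster/spark/historylog
rm -f "$conf"
```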
4. Add the SPARK_HISTORY_OPTS settings to spark-env.sh
[root@sht-sgmhadoopnn-01 conf]# vi spark-env.sh
#!/usr/bin/env bash
export SCALA_HOME=/root/learnproject/app/scala
export JAVA_HOME=/usr/java/jdk1.8.0_111
export SPARK_MASTER_IP=172.16.101.55
export SPARK_WORKER_MEMORY=1g
export SPARK_PID_DIR=/root/learnproject/app/pid
export HADOOP_CONF_DIR=/root/learnproject/app/hadoop/etc/hadoop
export SPARK_HISTORY_OPTS="-Dspark.history.fs.logDirectory=hdfs://mycluster/spark/historylog \
-Dspark.history.ui.port=18080 \
-Dspark.history.retainedApplications=20"
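With `spark.eventLog.compress true`, each finished application leaves one file in the log directory named after its YARN application ID plus a codec suffix (`.lz4` here, assuming the default `spark.io.compression.codec`); a `.inprogress` suffix marks applications still running. The history server scans this directory to build its application list. A sketch with made-up application IDs in a local temp directory:

```shell
# Simulate the event-log directory with two illustrative log files:
# one completed, one still in progress
logdir=$(mktemp -d)
touch "$logdir/application_1487152648000_0001.lz4" \
      "$logdir/application_1487152648000_0002.lz4.inprogress"
# Count only completed logs (names without the .inprogress suffix)
done_count=$(ls "$logdir" | grep -c -v '\.inprogress$')
echo "$done_count"   # 1
rm -rf "$logdir"
```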
5. Start the service and verify
[root@sht-sgmhadoopnn-01 spark]# ./sbin/start-history-server.sh
starting org.apache.spark.deploy.history.HistoryServer, logging to /root/learnproject/app/spark/logs/spark-root-org.apache.spark.deploy.history.HistoryServer-1-sht-sgmhadoopnn-01.out
[root@sht-sgmhadoopnn-01 ~]# jps
28905 HistoryServer
30407 ProdServerStart
30373 ResourceManager
30957 NameNode
16949 Jps
30280 DFSZKFailoverController
31445 JobHistoryServer
[root@sht-sgmhadoopnn-01 ~]# ps -ef|grep spark
root 17283 16928 0 21:42 pts/2 00:00:00 grep spark
root 28905 1 0 Feb16 ? 00:09:11 /usr/java/jdk1.8.0_111/bin/java -cp /root/learnproject/app/spark/conf/:/root/learnproject/app/spark/jars/*:/root/learnproject/app/hadoop/etc/hadoop/ -Dspark.history.fs.logDirectory=hdfs://mycluster/spark/historylog -Dspark.history.ui.port=18080 -Dspark.history.retainedApplications=20 -Xmx1g org.apache.spark.deploy.history.HistoryServer
[root@sht-sgmhadoopnn-01 ~]# netstat -nlp|grep 28905
tcp 0 0 0.0.0.0:18080 0.0.0.0:* LISTEN 28905/java
[root@sht-sgmhadoopnn-01 ~]#
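Beyond `jps` and `netstat`, the history server exposes a REST API, and `GET /api/v1/applications` returns the applications it has indexed. On this cluster that would be `curl http://172.16.101.55:18080/api/v1/applications`. A sketch that parses a sample response of the shape that endpoint returns (the application id and name below are illustrative, not real output):

```shell
# Sample JSON of the form returned by /api/v1/applications
sample='[{"id":"application_1487152648000_0001","name":"demo","attempts":[{"completed":true}]}]'
# Extract the first application's id with a small python3 one-liner
first_id=$(echo "$sample" | python3 -c 'import sys, json; print(json.load(sys.stdin)[0]["id"])')
echo "$first_id"   # application_1487152648000_0001
# Against the real server:
#   curl http://172.16.101.55:18080/api/v1/applications
```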
That covers configuring the Spark History Server Web UI for Spark on YARN.