您好,登錄后才能下訂單哦!
hive獨立模式安裝--jared
該部署筆記是在2014年年初記錄,現在放在51cto上。
有關hadoop基礎環境的搭建請參考如下鏈接:
http://ganlanqing.blog.51cto.com/6967482/1387210
JDK版本:jdk-7u51-linux-x64.rpm
hadoop版本:hadoop-0.20.2.tar.gz
hive版本:hive-0.12.0.tar.gz
mysql驅動包版本:mysql-connector-java-5.1.7-bin.jar
1.安裝mysql環境
[root@master ~]# yum install mysql mysql-server -y
[root@master ~]# /etc/init.d/mysqld start
[root@master ~]# mysqladmin -uroot password "123456"
[root@master ~]# mysql -uroot -p123456
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.1.73 Source distribution
Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> create user 'hive' identified by '123456';
Query OK, 0 rows affected (0.00 sec)
mysql> GRANT ALL PRIVILEGES ON *.* TO 'hive'@'master' IDENTIFIED BY '123456' WITH GRANT OPTION;
Query OK, 0 rows affected (0.00 sec)
mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)
mysql> exit
Bye
[root@master ~]#
#########################
缺少hive用戶創建hive庫的步驟!
#########################
2.下載hive安裝包
[jared@master conf]$ wget http://mirror.bjtu.edu.cn/apache/hive/hive-0.12.0/hive-0.12.0-bin.tar.gz
[jared@master conf]$ gzip -d hive-0.12.0.tar.gz
[jared@master conf]$ tar -xf hive-0.12.0.tar
[jared@master conf]$ mv hive-0.12.0 hive
3.設置環境變量
[root@master ~]# vim /etc/profile
export JAVA_HOME=/usr/java/jdk1.7.0_51
export HIVE_HOME=/home/jared/hive
export HIVE_CONF_DIR=/home/jared/hive/conf
export HIVE_LIB=$HIVE_HOME/lib
export HADOOP_INSTALL=/home/jared/hadoop
export HBASE_INSTALL=/home/jared/hbase
export PATH=$PATH:$HADOOP_INSTALL/bin:$HBASE_INSTALL/bin:$HIVE_HOME/bin
[root@master ~]# source /etc/profile
[root@master ~]# exit
logout
[jared@master conf]$ pwd
/home/jared/hive/conf
[jared@master conf]$ source /etc/profile
[jared@master conf]$ echo $HIVE_HOME
/home/jared/hive
[jared@master conf]$ cp hive-env.sh.template hive-env.sh
[jared@master conf]$ vim hive-env.sh
export HADOOP_HEAPSIZE=1024
HADOOP_HOME=/home/jared/hadoop
export HIVE_CONF_DIR=/home/jared/hive/conf
export HIVE_AUX_JARS_PATH=/home/jared/hive/lib
[jared@master conf]$ source hive-env.sh
4.配置hive-site.xml
[jared@master conf]$ vim hive-site.xml
<configuration>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://master:3306/hive?createDatabaseIfNotExist=true</value>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hive</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>123456</value>
</property>
</configuration>
5.把mysql的驅動包拷貝到Hive安裝路徑下的lib目錄
[jared@master ~]$ wget http://cdn.mysql.com/archives/mysql-connector-java-5.1/mysql-connector-java-5.1.7.tar.gz
[jared@master ~]$ tar -zxvf mysql-connector-java-5.1.7.tar.gz
[jared@master ~]$ cd mysql-connector-java-5.1.7
[jared@master ~]$ cp mysql-connector-java-5.1.7-bin.jar /home/jared/hive/lib/
6.CLI訪問接口:shell
[jared@master ~]$ hive
Logging initialized using configuration in jar:file:/home/jared/hive/lib/hive-common-0.12.0.jar!/hive-log4j.properties
hive> show databases;
OK
default
Time taken: 11.506 seconds, Fetched: 1 row(s)
hive> create table test (key string);
OK
Time taken: 2.805 seconds
hive> show tables;
OK
test
Time taken: 0.091 seconds, Fetched: 1 row(s)
hive>
7.本地上傳數據測試
本地文件信息
文件名:access.log
大小:11M
[jared@master input]$ du -h access.log
11M access.log
[jared@master input]$ cat access.log |wc -l
60000
數據結構:
[jared@master input]$ cat access.log
1393960136.926 0 212.92.231.166 TCP_DENIED/403 1256 GET http://221.181.39.85/phpTest/zologize/axa.php - NONE/- text/html "-" "-" -
1393960137.600 0 212.92.231.166 TCP_DENIED/403 1264 GET http://221.181.39.85/phpMyAdmin/scripts/setup.php - NONE/- text/html "-" "-" -
1393960138.274 0 212.92.231.166 TCP_DENIED/403 1250 GET http://221.181.39.85/pma/scripts/setup.php - NONE/- text/html "-" "-" -
1393960138.946 0 212.92.231.166 TCP_DENIED/403 1258 GET http://221.181.39.85/myadmin/scripts/setup.php - NONE/- text/html "-" "-" -
1393960143.624 1 127.0.0.1 TCP_HIT/200 22874 GET http://www.chinacache.com/p_w_picpaths/logo.gif - NONE/- p_w_picpath/gif "-" "-" -
1393960143.628 1 127.0.0.1 TCP_HIT/200 22874 GET http://www.chinacache.com/p_w_picpaths/logo.gif - NONE/- p_w_picpath/gif "-" "-" -
1393960144.636 2 127.0.0.1 TCP_HIT/200 22874 GET http://www.chinacache.com/p_w_picpaths/logo.gif - NONE/- p_w_picpath/gif "-" "-" -
1393960145.643 2 127.0.0.1 TCP_HIT/200 22874 GET http://www.chinacache.com/p_w_picpaths/logo.gif - NONE/- p_w_picpath/gif "-" "-" -
1393982948.194 1 112.5.4.63 TCP_HIT/200 467 GET http://cu005.www.duba.net/duba/2011/kcomponent/kcom_commonfast/53a08fed.dat - NONE/- text/plain "-" "-" -
1393982948.246 0 218.203.54.25 TCP_HIT/200 462 GET http://cu005.www.duba.net/duba/2011/kcomponent/kcom_kvm2/indexkcom_kvm2.dat - NONE/- text/plain "-" "-" -
1393982948.258 0 218.203.54.25 TCP_HIT/200 467 GET http://cu005.www.duba.net/duba/2011/kcomponent/kcom_commonfast/53a08fed.dat - NONE/- text/plain "-" "-" -
建立表結構
hive> CREATE TABLE CU005_LOG (TIMES_TAMP STRING,RES_TIME INT,FC_IP STRING,FC_HANDLING STRING,FILE_SIZE INT,REQ_METHOD STRING,URL STRING,USER STRING,BACK_SRC STRING,MIME STRING,REFERER STRING,UA STRING,COOKIE STRING )ROW FORMAT DELIMITED FIELDS TERMINATED BY ' ' STORED AS TEXTFILE;
hive> show tables;
OK
cu005_log
Time taken: 0.08 seconds, Fetched: 1 row(s)
hive>desc cu005_log;
OK
times_tamp string None
res_time int None
fc_ip string None
fc_handling string None
file_size int None
req_method string None
url string None
user string None
back_src string None
mime string None
referer string None
ua string None
cookie string None
Time taken: 0.208 seconds, Fetched: 13 row(s)
hive>
導入本地數據
hive> LOAD DATA LOCAL INPATH '/home/jared/input/access.log' OVERWRITE INTO TABLE CU005_LOG;
Copying data from file:/home/jared/input/access.log
Copying file: file:/home/jared/input/access.log
Loading data to table default.cu005_log
Table default.cu005_log stats: [num_partitions: 0, num_files: 1, num_rows: 0, total_size: 10872324, raw_data_size: 0]
OK
Time taken: 1.811 seconds
在hadoop集群中的存放位置
hdfs://master:9000/user/hive/warehouse/cu005_log/access.log
[jared@master ~]$ hadoop dfs -ls /user/hive/warehouse/
Found 1 items
drwxr-xr-x - jared supergroup 0 2014-03-06 18:31 /user/hive/warehouse/cu005_log
查詢
hive> select count(*) from cu005_log;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapred.reduce.tasks=<number>
Starting Job = job_201402230829_0003, Tracking URL = http://master:50030/jobdetails.jsp?jobid=job_201402230829_0003
Kill Command = /home/jared/hadoop/bin/../bin/hadoop job -kill job_201402230829_0003
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2014-03-06 17:18:07,994 Stage-1 map = 0%, reduce = 0%
2014-03-06 17:18:32,121 Stage-1 map = 100%, reduce = 0%
2014-03-06 17:18:44,200 Stage-1 map = 100%, reduce = 33%
2014-03-06 17:18:47,220 Stage-1 map = 100%, reduce = 100%
Ended Job = job_201402230829_0003
MapReduce Jobs Launched:
Job 0: Map: 1 Reduce: 1 HDFS Read: 10872324 HDFS Write: 6 SUCCESS
Total MapReduce CPU Time Spent: 0 msec
OK
60000
Time taken: 77.157 seconds, Fetched: 1 row(s)
hive>
Web界面訪問
詳細請參考https://cwiki.apache.org/confluence/display/Hive/HiveWebInterface
需要在$HIVE_HOME/conf/hive_site.xml配置文件中添加一些配置項,添加字段如下:
<property>
<name>hive.hwi.listen.host</name>
<value>192.168.255.25</value>
<description>This is the host address the Hive Web Interface will listen on</description>
</property>
<property>
<name>hive.hwi.listen.port</name>
<value>9999</value>
<description>This is the port the Hive Web Interface will listen on</description>
</property>
<property>
<name>hive.hwi.war.file</name>
<value>lib/hive-hwi-0.12.0.war</value>
<description>This is the WAR file with the jsp content for Hive Web Interface</description>
</property>
啟動hive hwi
后臺啟動
[jared@master ~]$ nohup hive --service hwi > /dev/null 2> /dev/null &
瀏覽器訪問 http://192.168.255.25:9999/hwi/
使用參考http://www.cnblogs.com/gpcuster/archive/2010/02/25/1673480.html
HWI與CLI對比
如果使用過cli的朋友看了上面的介紹,一定會發現一個很嚴重的問題:執行的過程沒有提示。我們不知道某一個查詢執行是什么時候結束的。
總結一下HWI與CLI對比的優缺點:
優點:HWI支持瀏覽器的方式瀏覽,方便直觀。
缺點:無執行過程提示。
我個人還是更傾向于使用cli的方式
免責聲明:本站發布的內容(圖片、視頻和文字)以原創、轉載和分享為主,文章觀點不代表本網站立場,如果涉及侵權請聯系站長郵箱:is@yisu.com進行舉報,并提供相關證據,一經查實,將立刻刪除涉嫌侵權內容。