

HBase Data Import with ImportTsv

Published: 2020-09-03 · Author: JUN_LJ · Category: Big Data

The ImportTsv tool runs as a MapReduce job, so YARN must be started first. It also loads classes from jar files, so make sure the classpath is configured correctly. By default, ImportTsv inserts data through the HBase API (Put operations).
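Before running an import, it helps to confirm that YARN is actually up and that the hbase launcher can assemble its classpath. A minimal sanity check (the exact process names depend on your deployment):

[hadoop-user@rhel work]$ jps | grep -E 'ResourceManager|NodeManager'
[hadoop-user@rhel work]$ hbase classpath | head -c 200

The environment used in this walkthrough is configured in .bash_profile: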

[hadoop-user@rhel work]$ cat /home/hadoop-user/.bash_profile
# .bash_profile

# Get the aliases and functions
if [ -f ~/.bashrc ]; then
        . ~/.bashrc
fi

# User specific environment and startup programs
PATH=$PATH:$HOME/bin
export PATH

JAVA_HOME=/usr/java/jdk1.8.0_171-amd64
PATH=$PATH:$JAVA_HOME/bin
CLASSPATH=$CLASSPATH:$JAVA_HOME/lib

HADOOP_HOME=/home/hadoop-user/hadoop-2.8.0
PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
CLASSPATH=$CLASSPATH:$HADOOP_HOME/lib

HBASE_HOME=/home/hadoop-user/hbase-2.0.0
PATH=$PATH:$HBASE_HOME/bin
CLASSPATH=$CLASSPATH:$HBASE_HOME/lib

ZOOKEEPER_HOME=/home/hadoop-user/zookeeper-3.4.12
PATH=$PATH:$ZOOKEEPER_HOME/bin

PHOENIX_HOME=/home/hadoop-user/apache-phoenix-5.0.0-alpha-HBase-2.0-bin
PATH=$PATH:$PHOENIX_HOME/bin

export PATH




Create the table

hbase(main):033:0> create 'test','cf'
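To double-check that the table and its column family exist, you can describe it from the HBase shell (the prompt counter will vary):

hbase(main):034:0> describe 'test'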

Create the file to import

[hadoop-user@rhel work]$ cat /home/hadoop-user/work/sample1.csv
row10,"mjj10"
row11,"mjj11"
row12,"mjj12"
row14,"mjj13"

Put the file into HDFS

[hadoop-user@rhel work]$ hdfs dfs -put /home/hadoop-user/work/sample1.csv /sample1.csv
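A quick listing confirms the upload before kicking off the job:

[hadoop-user@rhel work]$ hdfs dfs -ls /sample1.csv
[hadoop-user@rhel work]$ hdfs dfs -cat /sample1.csv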

The ImportTsv import command

hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.separator="," -Dimporttsv.columns=HBASE_ROW_KEY,cf:a test /sample1.csv

Note: HBASE_ROW_KEY marks the position of the row key in the input file; the entries after it define the columns. Here each record is imported into column family cf with qualifier a, and the file to import is /sample1.csv in HDFS.
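Once the MapReduce job completes, a scan shows whether the four rows landed. Note that ImportTsv splits purely on the separator and does not strip CSV quoting, so with the sample file above the stored values include the literal double quotes:

hbase(main):035:0> scan 'test'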

Explanation from the built-in help


Usage: importtsv -Dimporttsv.columns=a,b,c <tablename> <inputdir>

Imports the given input directory of TSV data into the specified table.

The column names of the TSV data must be specified using the -Dimporttsv.columns
option. This option takes the form of comma-separated column names, where each
column name is either a simple column family, or a columnfamily:qualifier. The special
column name HBASE_ROW_KEY is used to designate that this column should be used
as the row key for each imported record. You must specify exactly one column
to be the row key, and you must specify a column name for every column that exists in the
input data. Another special column HBASE_TS_KEY designates that this column should be
used as timestamp for each record. Unlike HBASE_ROW_KEY, HBASE_TS_KEY is optional.
You must specify at most one column as timestamp key for each imported record.
Record with invalid timestamps (blank, non-numeric) will be treated as bad record.
Note: if you use this option, then 'importtsv.timestamp' option will be ignored.
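To illustrate HBASE_TS_KEY, here is a sketch with a made-up input file whose second field is an epoch-millisecond timestamp (the file name and contents are hypothetical; the file is put into HDFS the same way as before):

[hadoop-user@rhel work]$ cat /home/hadoop-user/work/sample_ts.csv
row20,1530150000000,"mjj20"
[hadoop-user@rhel work]$ hdfs dfs -put /home/hadoop-user/work/sample_ts.csv /sample_ts.csv
[hadoop-user@rhel work]$ hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.separator="," -Dimporttsv.columns=HBASE_ROW_KEY,HBASE_TS_KEY,cf:a test /sample_ts.csv

Each cell is then written with the timestamp from the second column instead of the region server's current time.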

Note: Data imported with ImportTsv is not visible to Phoenix. In fact, tables created directly in HBase are not visible to Phoenix at all, while tables created through Phoenix are visible from HBase, but their contents appear in Phoenix's encoded form.
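If Phoenix does need to read a table that was created and filled directly in HBase, the approach described in the Phoenix documentation is to map it with a view. A sketch for the test table used here, assuming plain-string values (sqlline prompt abbreviated):

0: jdbc:phoenix:...> CREATE VIEW "test" (pk VARCHAR PRIMARY KEY, "cf"."a" VARCHAR);
0: jdbc:phoenix:...> SELECT * FROM "test";

VARCHAR columns map byte-for-byte to the raw strings HBase stores; numeric types would appear garbled, since Phoenix uses its own binary encodings for them.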



By default, importtsv loads data with the HBase Put API; when the -Dimporttsv.bulk.output option is given, it instead first generates files in HBase's internal HFile format. As the documentation explains:

The importtsv tool, by default, uses the HBase Put API to insert data into the HBase table using TableOutputFormat in its map phase. But when the -Dimporttsv.bulk.output option is specified, it instead generates HBase internal format (HFile) files on HDFS by using HFileOutputFormat. Therefore, we can then use the completebulkload tool to load the generated files into a running cluster. The following steps are to use the bulk output and load tools:


Command to generate HFile-format files

hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.separator="," -Dimporttsv.bulk.output=/hfiles_tsv -Dimporttsv.columns=HBASE_ROW_KEY,cf:a test /sample1.csv

Note: The generated HFile-format files are placed in the /hfiles_tsv directory in HDFS; the command creates the directory itself.

[hadoop-user@rhel work]$ hdfs dfs -ls /hfiles_tsv/cf

18/06/28 10:49:21 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

Found 1 items

-rw-r--r--   1 hadoop-user supergroup       5125 2018-06-28 10:40 /hfiles_tsv/cf/0e466616d42a4a128fb60caa7dbe075a

Note: The file name 0e466616d42a4a128fb60caa7dbe075a follows much the same format as region names in the web UI.

Attempting the load via

hadoop jar hbase-server-2.0.0.jar completebulkload /hfiles_tsv 'test'

fails with: Exception in thread "main" java.lang.ClassNotFoundException: completebulkload

The HBase documentation describes two ways to invoke this utility: with an explicit classname, or via the driver.

Explicit Classname

$ bin/hbase org.apache.hadoop.hbase.tool.LoadIncrementalHFiles <hdfs://storefileoutput> <tablename>

Driver

HADOOP_CLASSPATH=`${HBASE_HOME}/bin/hbase classpath` ${HADOOP_HOME}/bin/hadoop jar ${HBASE_HOME}/hbase-server-VERSION.jar completebulkload <hdfs://storefileoutput> <tablename>
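Applied to this walkthrough, the explicit-classname form should work directly (LoadIncrementalHFiles is the HBase 2.0 class name quoted above):

[hadoop-user@rhel work]$ hbase org.apache.hadoop.hbase.tool.LoadIncrementalHFiles /hfiles_tsv test

The earlier ClassNotFoundException most likely occurred because the bare hbase-server jar path did not resolve to a driver jar, so hadoop treated completebulkload as a class name. The driver form needs both HADOOP_CLASSPATH and the full jar path, along the lines of the following sketch (the jar name is an assumption based on the HBase 2.0 layout; check $HBASE_HOME/lib for the exact file):

[hadoop-user@rhel work]$ HADOOP_CLASSPATH=`hbase classpath` hadoop jar $HBASE_HOME/lib/hbase-mapreduce-2.0.0.jar completebulkload /hfiles_tsv test

Afterwards, scan 'test' in the HBase shell should show the imported rows.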





