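This article explains how to connect to and use HBase from Spark. The method is short and practical: prepare the Spark classpath, write a small Java program that reads an HBase table, and submit it to the cluster.
I. Environment setup
1. Copy the required jars from the HBase lib directory into <spark directory>/lib/hbase. Spark depends on these jars.
The list is: guava-12.0.1.jar, htrace-core-3.1.0-incubating.jar, protobuf-java-2.5.0.jar, plus every jar whose name starts with hbase. Nothing else is needed, and copying the entire lib directory causes errors.
2. Edit the Spark configuration file (spark-env.sh) and append the following line at the end: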
export SPARK_CLASSPATH=/usr/local/spark-1.5.1-bin-hadoop2.4/lib/hbase/*
3. Restart the Spark cluster.
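The jar selection rule in step 1 can be sketched in Java as follows. This is only an illustration of the rule, not part of the original setup: the class name `HBaseJarSelector`, the method `select`, and the sample file names in `main` are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper: picks the jars that step 1 says to copy. */
final class HBaseJarSelector {
    // The three extra jars named in step 1.
    private static final Set<String> EXTRA = new HashSet<>(Arrays.asList(
            "guava-12.0.1.jar",
            "htrace-core-3.1.0-incubating.jar",
            "protobuf-java-2.5.0.jar"));

    /**
     * Keeps names that start with "hbase" and end with ".jar",
     * plus the three extra jars, preserving the input order.
     */
    static List<String> select(List<String> fileNames) {
        List<String> selected = new ArrayList<>();
        for (String name : fileNames) {
            boolean hbaseJar = name.startsWith("hbase") && name.endsWith(".jar");
            if (hbaseJar || EXTRA.contains(name)) {
                selected.add(name);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // Sample names only; a real run would list the files in the HBase lib directory.
        List<String> lib = Arrays.asList(
                "hbase-client-1.0.1.1.jar", "guava-12.0.1.jar", "jetty-6.1.26.jar");
        System.out.println(select(lib));
    }
}
```

Filtering by name rather than copying everything follows the warning in step 1 that copying the whole lib directory causes errors.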
II. Code
package com.xx;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableInputFormat;
import org.apache.hadoop.hbase.protobuf.ProtobufUtil;
import org.apache.hadoop.hbase.protobuf.generated.ClientProtos;
import org.apache.hadoop.hbase.util.Base64;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;

import java.io.IOException;

/**
 * Reads HBase data from Spark.
 * @author Chenj
 */
public class ReadHBase {

    private static final Log LOG = LogFactory.getLog(ReadHBase.class);
    private static final String appName = "hbase test";
    private static final String master = "spark://192.168.1.21:7077";

    public static void main(String[] args) {
        SparkConf conf = new SparkConf()
                .setAppName(appName)
                .setMaster(master)
                .setSparkHome(System.getenv("SPARK_HOME"))
                // The "jars" environment variable must point at the application jar.
                .setJars(new String[]{System.getenv("jars")});

        Configuration configuration = HBaseConfiguration.create();
        // ZooKeeper client port
        configuration.set("hbase.zookeeper.property.clientPort", "2181");
        // ZooKeeper quorum
        configuration.set("hbase.zookeeper.quorum", "192.168.1.19");
        // Load the HBase configuration. A Path is used because addResource(String)
        // looks the name up on the classpath instead of the local file system.
        configuration.addResource(new Path("/usr/local/hbase-1.0.1.1/conf/hbase-site.xml"));
        // Table to read
        configuration.set(TableInputFormat.INPUT_TABLE, "heartSocket");

        JavaSparkContext sc = new JavaSparkContext(conf);

        Scan scan = new Scan();
        scan.addFamily(Bytes.toBytes("d"));
        // addColumn narrows the scan to the single qualifier d:consumeTime.
        scan.addColumn(Bytes.toBytes("d"), Bytes.toBytes("consumeTime"));

        try {
            // TableInputFormat expects the Scan as a Base64-encoded protobuf string.
            ClientProtos.Scan proto = ProtobufUtil.toScan(scan);
            String scanToString = Base64.encodeBytes(proto.toByteArray());
            configuration.set(TableInputFormat.SCAN, scanToString);
        } catch (IOException e) {
            LOG.error("Failed to serialize the scan", e);
        }

        JavaPairRDD<ImmutableBytesWritable, Result> rdd = sc.newAPIHadoopRDD(
                configuration,
                TableInputFormat.class,
                ImmutableBytesWritable.class,
                Result.class);

        LOG.info("Total row count: " + rdd.count());
        sc.stop();
    }
}
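The program above passes the raw value of the `jars` environment variable to `setJars` as a single array entry, so an unset variable yields an array containing null. One possible way to guard against that is sketched below. The class `JarListParser`, its `parse` method, and the comma-separated format are assumptions for illustration; the original program does not define them.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of the original program. */
final class JarListParser {

    /**
     * Splits a comma-separated list of jar paths.
     * A null or blank input gives an empty array instead of a null entry.
     */
    static String[] parse(String value) {
        List<String> jars = new ArrayList<>();
        if (value != null) {
            for (String part : value.split(",")) {
                String trimmed = part.trim();
                if (!trimmed.isEmpty()) {
                    jars.add(trimmed);
                }
            }
        }
        return jars.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // "jars" is the environment variable name used by the program above.
        String[] jars = parse(System.getenv("jars"));
        System.out.println(jars.length + " jar(s) to ship");
    }
}
```

With such a helper, the call in the program could become `setJars(JarListParser.parse(System.getenv("jars")))`, which also allows more than one jar to be listed if the variable is written in that assumed format.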
III. Submit and run
./spark-submit --class com.xx.ReadHBase --master spark://ser21:7077 /usr/local/spark-1.0-SNAPSHOT.jar
That covers how to connect to and use HBase from Spark. Try running the steps on your own cluster to confirm the setup.