This article describes how to troubleshoot and repair HBase. It goes into a fair amount of detail and should be a useful reference.
Always start with the master server's log. Normally it is the same line repeating over and over; if it is not, something is wrong, and you can search for the exception you see on Google or search-hadoop.com.
Errors rarely appear in isolation in HBase. Usually one thing goes wrong somewhere and triggers a flood of exceptions and stack traces everywhere else. The best approach is to walk back up the log to find the initial exception. For example, a RegionServer prints some metrics when it exits; grep through that dump and you should find the original exception.
RegionServer suicides are quite "normal": when things go wrong, they kill themselves. If ulimit and xcievers have not been raised, HDFS cannot operate normally and, from HBase's point of view, HDFS is dead. Imagine what MySQL would do if it suddenly lost access to its filesystem; the same thing happens to HBase on HDFS. Another common cause of RegionServer seppuku is a garbage-collection pause that lasts longer than the ZooKeeper session timeout.
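Where long GC pauses are the culprit, one common mitigation is to lengthen the ZooKeeper session timeout in hbase-site.xml. A sketch; the 120-second value below is only an example, not a recommendation:

```xml
<!-- hbase-site.xml: give the RegionServer more time to survive a GC
     pause before ZooKeeper expires its session (120000 ms = 2 min). -->
<property>
  <name>zookeeper.session.timeout</name>
  <value>120000</value>
</property>
```

Raising the timeout trades faster failure detection for tolerance of GC pauses, so keep it as low as your observed pauses allow.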
Locations of the important logs (<user> is the user who started the service; <hostname> is the machine's name):
NameNode: $HADOOP_HOME/logs/hadoop-<user>-namenode-<hostname>.log
DataNode: $HADOOP_HOME/logs/hadoop-<user>-datanode-<hostname>.log
JobTracker: $HADOOP_HOME/logs/hadoop-<user>-jobtracker-<hostname>.log
TaskTracker: $HADOOP_HOME/logs/hadoop-<user>-tasktracker-<hostname>.log
HMaster: $HBASE_HOME/logs/hbase-<user>-master-<hostname>.log
RegionServer: $HBASE_HOME/logs/hbase-<user>-regionserver-<hostname>.log
ZooKeeper: TODO
Enabling RPC-level logging
Enabling RPC-level logging on a RegionServer can often give insight into timings at the server. Once enabled, the amount of log spewed is voluminous, so it is not recommended that you leave this logging on for more than short bursts of time. To enable RPC-level logging, browse to the RegionServer UI and click on Log Level. Set the log level to DEBUG for the package org.apache.hadoop.ipc (that's right, hadoop.ipc, NOT hbase.ipc). Then tail the RegionServer's log and analyze.
To disable, set the logging level back to INFO.
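The same setting can also be made persistent in the RegionServer's log4j configuration instead of the UI. A sketch; unlike the UI toggle, a log4j.properties change only takes effect after a restart:

```properties
# log4j.properties: RPC-level logging for HBase.
# Note the package is org.apache.hadoop.ipc, NOT org.apache.hbase.ipc.
log4j.logger.org.apache.hadoop.ipc=DEBUG
```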
JVM garbage collection logs
HBase is memory intensive, and using the default GC you can see long pauses in all threads, including the Juliet Pause aka "GC of Death". To help debug or confirm this is happening, GC logging can be turned on in the Java virtual machine.
To enable, in hbase-env.sh add:
export HBASE_OPTS="-XX:+UseConcMarkSweepGC -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:/home/hadoop/hbase/logs/gc-hbase.log"
Adjust the log directory to wherever you log. Note: The GC log does NOT roll automatically, so you'll have to keep an eye on it so it doesn't fill up the disk.
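Since the GC log does not roll on its own, one option is an external rotation job. A hypothetical logrotate snippet (the path is the example from above; copytruncate is needed because the JVM keeps the file handle open):

```
# /etc/logrotate.d/hbase-gc (hypothetical)
/home/hadoop/hbase/logs/gc-hbase.log {
    weekly
    rotate 4
    compress
    copytruncate
    missingok
}
```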
At this point you should see logs like so:
64898.952: [GC [1 CMS-initial-mark: 2811538K(3055704K)] 2812179K(3061272K), 0.0007360 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] 64898.953: [CMS-concurrent-mark-start] 64898.971: [GC 64898.971: [ParNew: 5567K->576K(5568K), 0.0101110 secs] 2817105K->2812715K(3061272K), 0.0102200 secs] [Times: user=0.07 sys=0.00, real=0.01 secs]
In this section, the first line indicates a 0.0007360 second pause for the CMS to initially mark. This pauses the entire VM, all threads for that period of time.
The third line indicates a "minor GC", which pauses the VM for 0.0101110 seconds - aka 10 milliseconds. It has reduced the "ParNew" from about 5.5m to 576k. Later on in this cycle we see:
64901.445: [CMS-concurrent-mark: 1.542/2.492 secs] [Times: user=10.49 sys=0.33, real=2.49 secs] 64901.445: [CMS-concurrent-preclean-start] 64901.453: [GC 64901.453: [ParNew: 5505K->573K(5568K), 0.0062440 secs] 2868746K->2864292K(3061272K), 0.0063360 secs] [Times: user=0.05 sys=0.00, real=0.01 secs] 64901.476: [GC 64901.476: [ParNew: 5563K->575K(5568K), 0.0072510 secs] 2869283K->2864837K(3061272K), 0.0073320 secs] [Times: user=0.05 sys=0.01, real=0.01 secs] 64901.500: [GC 64901.500: [ParNew: 5517K->573K(5568K), 0.0120390 secs] 2869780K->2865267K(3061272K), 0.0121150 secs] [Times: user=0.09 sys=0.00, real=0.01 secs] 64901.529: [GC 64901.529: [ParNew: 5507K->569K(5568K), 0.0086240 secs] 2870200K->2865742K(3061272K), 0.0087180 secs] [Times: user=0.05 sys=0.00, real=0.01 secs] 64901.554: [GC 64901.555: [ParNew: 5516K->575K(5568K), 0.0107130 secs] 2870689K->2866291K(3061272K), 0.0107820 secs] [Times: user=0.06 sys=0.00, real=0.01 secs] 64901.578: [CMS-concurrent-preclean: 0.070/0.133 secs] [Times: user=0.48 sys=0.01, real=0.14 secs] 64901.578: [CMS-concurrent-abortable-preclean-start] 64901.584: [GC 64901.584: [ParNew: 5504K->571K(5568K), 0.0087270 secs] 2871220K->2866830K(3061272K), 0.0088220 secs] [Times: user=0.05 sys=0.00, real=0.01 secs] 64901.609: [GC 64901.609: [ParNew: 5512K->569K(5568K), 0.0063370 secs] 2871771K->2867322K(3061272K), 0.0064230 secs] [Times: user=0.06 sys=0.00, real=0.01 secs] 64901.615: [CMS-concurrent-abortable-preclean: 0.007/0.037 secs] [Times: user=0.13 sys=0.00, real=0.03 secs] 64901.616: [GC[YG occupancy: 645 K (5568 K)]64901.616: [Rescan (parallel) , 0.0020210 secs]64901.618: [weak refs processing, 0.0027950 secs] [1 CMS-remark: 2866753K(3055704K)] 2867399K(3061272K), 0.0049380 secs] [Times: user=0.00 sys=0.01, real=0.01 secs] 64901.621: [CMS-concurrent-sweep-start]
The first line indicates that the CMS concurrent mark (finding garbage) has taken 2.4 seconds. But these are concurrent 2.4 seconds: Java was not paused at any point in time.
There are a few more minor GCs, then there is a pause at the second-to-last line:
64901.616: [GC[YG occupancy: 645 K (5568 K)]64901.616: [Rescan (parallel) , 0.0020210 secs]64901.618: [weak refs processing, 0.0027950 secs] [1 CMS-remark: 2866753K(3055704K)] 2867399K(3061272K), 0.0049380 secs] [Times: user=0.00 sys=0.01, real=0.01 secs]
The pause here is 0.0049380 seconds (aka 4.9 milliseconds) to 'remark' the heap.
At this point the sweep starts, and you can watch the heap size go down:
64901.637: [GC 64901.637: [ParNew: 5501K->569K(5568K), 0.0097350 secs] 2871958K->2867441K(3061272K), 0.0098370 secs] [Times: user=0.05 sys=0.00, real=0.01 secs] ... lines removed ... 64904.936: [GC 64904.936: [ParNew: 5532K->568K(5568K), 0.0070720 secs] 1365024K->1360689K(3061272K), 0.0071930 secs] [Times: user=0.05 sys=0.00, real=0.01 secs] 64904.953: [CMS-concurrent-sweep: 2.030/3.332 secs] [Times: user=9.57 sys=0.26, real=3.33 secs]
At this point, the CMS sweep took 3.332 seconds, and the heap went from roughly 2.8 GB down to about 1.3 GB.
The key point here is to keep all these pauses low. CMS pauses are always low, but if your ParNew starts growing, you can see minor GC pauses approach 100ms, exceed 100ms, and reach as high as 400ms.
This can be due to the size of the ParNew, which should be relatively small. If your ParNew is very large after running HBase for a while (in one example a ParNew was about 150MB), then you might have to constrain the size of ParNew: the larger it is, the longer the collections take, but if it's too small, objects are promoted to old gen too quickly. In the below we constrain new gen size to 64m.
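To spot such creeping minor-GC pauses without eyeballing the log, a small script can pull the ParNew pause times out of the lines shown above. A minimal sketch; the 100 ms threshold is an arbitrary example:

```python
import re

# Matches the pause duration of a ParNew minor collection in a CMS GC log,
# e.g. "[ParNew: 5567K->576K(5568K), 0.0101110 secs]".
PARNEW_RE = re.compile(r"\[ParNew: \d+K->\d+K\(\d+K\), ([0-9.]+) secs\]")

def parnew_pauses(log_lines, threshold=0.1):
    """Return (pause_seconds, over_threshold) for each ParNew entry found."""
    pauses = []
    for line in log_lines:
        for match in PARNEW_RE.finditer(line):
            secs = float(match.group(1))
            pauses.append((secs, secs > threshold))
    return pauses

sample = ("64898.971: [GC 64898.971: [ParNew: 5567K->576K(5568K), "
          "0.0101110 secs] 2817105K->2812715K(3061272K), 0.0102200 secs]")
print(parnew_pauses([sample]))  # → [(0.010111, False)]
```

Tail the output for entries flagged True; a run of them is the signal to look at ParNew sizing, as discussed next.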
Add this to HBASE_OPTS:
export HBASE_OPTS="-XX:NewSize=64m -XX:MaxNewSize=64m <cms options from above> <gc logging options from above>"
jstack is one of the most important Java tools (besides reading the logs) for seeing what a specific Java process is doing. First find the process id with jps, then run jstack against it. It lists the threads in creation order, along with what each thread is doing.
RPC requests from the client to a RegionServer can time out. For example, if Scan.setCaching is set to 500, each RPC fetches 500 rows of data, one fetch per 500 .next() calls. Because the data travels to the client in large blocks, the request may time out. Lowering the setCaching value is one fix, but setting it too low hurts performance.
(Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)) There can be several causes that produce this symptom.
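The trade-off is easy to quantify: the number of scan RPCs is just the row count divided by the caching value. A back-of-the-envelope sketch (the row counts are hypothetical):

```python
from math import ceil

def scan_rpc_count(total_rows, caching):
    """Number of RPC round trips a scan needs when the client
    fetches `caching` rows per request."""
    return ceil(total_rows / caching)

# With caching=500, scanning 100,000 rows takes 200 RPCs, each moving a
# large block of data -- big responses are what risk the client timeout.
print(scan_rpc_count(100_000, 500))  # → 200
# Lowering caching to 50 means smaller, faster responses but 10x the RPCs.
print(scan_rpc_count(100_000, 50))   # → 2000
```

Pick the largest caching value whose per-response transfer still finishes comfortably inside the RPC timeout.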
First, check that you have a valid Kerberos ticket. One is required in order to set up communication with a secure Apache HBase cluster. Examine the ticket currently in the credential cache, if any, by running the klist command line utility. If no ticket is listed, you must obtain a ticket by running the kinit command with either a keytab specified, or by interactively entering a password for the desired principal.
Then, consult the Java Security Guide troubleshooting section. The most common problem addressed there is resolved by setting javax.security.auth.useSubjectCredsOnly system property value to false.
Because of a change in the format in which MIT Kerberos writes its credentials cache, there is a bug in the Oracle JDK 6 Update 26 and earlier that causes Java to be unable to read the Kerberos credentials cache created by versions of MIT Kerberos 1.8.1 or higher. If you have this problematic combination of components in your environment, to work around this problem, first log in with kinit and then immediately refresh the credential cache with kinit -R. The refresh will rewrite the credential cache without the problematic formatting.
Finally, depending on your Kerberos configuration, you may need to install the Java Cryptography Extension, or JCE. Ensure the JCE jars are on the classpath on both server and client systems.
You may also need to download the unlimited strength JCE policy files. Uncompress the downloaded file and install the policy jars into <java-home>/lib/security.
1. Repair the hbase meta table (rebuild META from the regioninfo files on HDFS):
hbase hbck -fixMeta
2. Reassign the META table's regions to RegionServers (using META, hand each of its regions to a RegionServer):
hbase hbck -fixAssignments
When holes appear:
hbase hbck -fixHdfsHoles (create a new, empty region directory)
hbase hbck -fixMeta (rebuild META from the regioninfo files)
hbase hbck -fixAssignments (assign the regions to RegionServers)
View the regions in meta:
scan 'hbase:meta' , {LIMIT=>10,FILTER=>"PrefixFilter('INDEX_11')"}
During a data migration we hit two duplicate regions:
b0c8f08ffd7a96219f748ef14d7ad4f8,73ab00eaa7bab7bc83f440549b9749a3
Delete the two duplicate regions:
delete 'hbase:meta','INDEX_11,4380_2431,1429757926776.b0c8f08ffd7a96219f748ef14d7ad4f8.','info:regioninfo'
delete 'hbase:meta','INDEX_11,5479_0041431700000000040100004815E9,1429757926776.73ab00eaa7bab7bc83f440549b9749a3.','info:regioninfo'
Delete the two duplicate HDFS directories:
/hbase/data/default/INDEX_11/b0c8f08ffd7a96219f748ef14d7ad4f8
/hbase/data/default/INDEX_11/73ab00eaa7bab7bc83f440549b9749a3
Then restart the corresponding RegionServer (only to refresh the region-in-transition status reported to the HMaster).
Data loss is certain: whatever data sat in the duplicate regions that never came online is lost.
(1) The hbase.version file is missing
Fix with -fixVersionFile.
(2) A region is neither in the META table nor on HDFS, but is in a RegionServer's online-region set
Fix with -fixAssignments.
(3) A region is in META and in a RegionServer's online-region set, but is not on HDFS
Fix with -fixAssignments -fixMeta (-fixAssignments tells the RegionServer to close the region; -fixMeta deletes the region's record from META).
(4) A region has no record in META and is not served by any RegionServer, but exists on HDFS
Fix with -fixMeta -fixAssignments (-fixAssignments assigns the region; -fixMeta adds the region's record to META).
(5) A region has no record in META, exists on HDFS, and is being served by a RegionServer
Fix with -fixMeta: it adds the region's record to META, undeploys the region first, then assigns it.
(6) A region has a record in META, but does not exist on HDFS and is not served by any RegionServer
Fix with -fixMeta: delete the record from META.
(7) A region is in META and on HDFS, the table is not disabled, but the region is not being served
Fix with -fixAssignments: assign the region.
(8) A region is in META and on HDFS, the table is disabled, but the region is being served by some RegionServer
Fix with -fixAssignments: undeploy the region.
(9) A region is in META and on HDFS, the table is not disabled, but the region is being served by multiple RegionServers
Fix with -fixAssignments: tell every RegionServer to close the region, then assign it.
(10) A region is in META and on HDFS and should be served, but the RegionServer recorded in META is not the one actually serving it
Fix with -fixAssignments.
(11) Region holes
Requires -fixHdfsHoles: it creates a new empty region to plug the hole, but neither assigns the region nor adds its record to META.
(12) A region on HDFS has no .regioninfo file
Fix with -fixHdfsOrphans.
(13) Region overlaps
Requires -fixHdfsOverlaps.
Notes:
(1) When repairing region holes, -fixHdfsHoles only creates a new empty region to plug the gap; you still need -fixAssignments -fixMeta to finish the job (-fixAssignments assigns the region; -fixMeta adds its record to META). Hence the combined option -repairHoles, equivalent to -fixAssignments -fixMeta -fixHdfsHoles -fixHdfsOrphans.
(2) -fixAssignments repairs regions that are unassigned, should not be assigned, or are assigned more than once.
(3) -fixMeta deletes a region's record from META if the region is not on HDFS, and adds the record to META if it is.
(4) -repair enables all repair options, equivalent to -fixAssignments -fixMeta -fixHdfsHoles -fixHdfsOrphans -fixHdfsOverlaps -fixVersionFile -sidelineBigOverlaps.
Newer versions of hbck gather table and region information from three places: (1) the HDFS directories, (2) META, and (3) the RegionServers; it uses this to diagnose inconsistencies and repair them.
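The single-region cases above boil down to a decision on three booleans: is the region in META, on HDFS, and deployed on a RegionServer. A simplified illustration of that decision table (not hbck's actual logic; it ignores holes, overlaps, and multiple assignment):

```python
def hbck_fix_options(in_meta, in_hdfs, deployed, table_disabled=False):
    """Map a region's (META, HDFS, deployed) state to the hbck fix flags
    suggested in the cases above. Purely illustrative."""
    if not in_meta and not in_hdfs and deployed:
        return ["-fixAssignments"]                 # case (2): close it
    if in_meta and not in_hdfs and deployed:
        return ["-fixAssignments", "-fixMeta"]     # case (3)
    if not in_meta and in_hdfs and not deployed:
        return ["-fixMeta", "-fixAssignments"]     # case (4)
    if not in_meta and in_hdfs and deployed:
        return ["-fixMeta"]                        # case (5)
    if in_meta and not in_hdfs and not deployed:
        return ["-fixMeta"]                        # case (6): drop record
    if in_meta and in_hdfs and not deployed and not table_disabled:
        return ["-fixAssignments"]                 # case (7): assign
    if in_meta and in_hdfs and deployed and table_disabled:
        return ["-fixAssignments"]                 # case (8): undeploy
    return []                                      # consistent region

print(hbck_fix_options(in_meta=False, in_hdfs=True, deployed=False))
# → ['-fixMeta', '-fixAssignments']
```

A healthy region (in META, on HDFS, deployed, table enabled) falls through to the empty list, which is why a clean cluster needs no fix flags at all.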