This article walks through a production case in which one node of an Oracle 11g RAC cluster failed to start, from the symptoms through the log analysis to the fix.
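I. Environment
Two-node 11g RAC on AIX minicomputers
II. Symptoms
Node 2 could not start.
Running crsctl start crs returned an error.
III. Analysis and resolution
1. Check the database log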
Archived Log entry 399348 added for thread 2 sequence 205493 ID 0xffffffff8452e669 dest 1:
Sat Dec 09 11:13:47 2017
Thread 2 advanced to log sequence 205495 (LGWR switch)
  Current log# 3 seq# 205495 mem# 0: +DATA/orcl2/onlinelog/group_3.257.890091875
Sat Dec 09 11:13:51 2017
Archived Log entry 399349 added for thread 2 sequence 205494 ID 0xffffffff8452e669 dest 1:
Sat Dec 09 11:24:07 2017
NOTE: ASMB terminating
Errors in file /u01/app/oracle/diag/rdbms/orcl2/PTS22/trace/PTS22_asmb_8847608.trc:
ORA-15064: ? ASM ??????
ORA-03113: ?????????
Errors in file /u01/app/oracle/diag/rdbms/orcl2/PTS22/trace/PTS22_asmb_8847608.trc:
ORA-15064: ? ASM ??????
ORA-03113: ?????????
ASMB (ospid: 8847608): terminating the instance due to error 15064
Sat Dec 09 11:24:07 2017
This points to a possible communication problem, so look up the error:
orcldb2:/u01/app/oracle/diag/rdbms/orcl2/orcl22/trace$oerr ora 15064
15064, 00000, "communication failure with ASM instance"
// *Cause: There was a failure to communicate with the ASM instance, most
//         likely because the connection went down.
// *Action: Check the accompanying error messages for more information on the
//          reason for the failure. Note that database instances will always
//          return this error when the ASM instance is terminated abnormally.
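The ORA- message text in the alert log excerpt appears only as question marks, so the error numbers are the reliable part. As a minimal sketch, the function below pulls the distinct ORA- codes out of log text read from standard input so that each can be looked up with oerr. The name ora_codes is illustrative, not an Oracle tool, and only POSIX awk and sort are assumed.

```shell
# Print each distinct ORA-nnnnn code found on standard input, one per line.
# ora_codes is an illustrative helper name, not part of any Oracle tooling.
ora_codes() {
    awk '{
        line = $0
        # A line may hold several codes; consume them left to right.
        while (match(line, /ORA-[0-9]+/)) {
            print substr(line, RSTART, RLENGTH)
            line = substr(line, RSTART + RLENGTH)
        }
    }' | sort -u
}
```

Fed the excerpt above, it prints ORA-03113 and ORA-15064. ORA-03113 is "end-of-file on communication channel", which is consistent with the communication failure reported for ORA-15064.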
2. Check the cluster log
2017-12-09 11:23:51.026 [cssd(7667900)]CRS-1612:Network communication with node orcldb1 (1) missing for 50% of timeout interval. Removal of this node from cluster in 14.523 seconds
2017-12-09 11:23:59.039 [cssd(7667900)]CRS-1611:Network communication with node orcldb1 (1) missing for 75% of timeout interval. Removal of this node from cluster in 6.509 seconds
2017-12-09 11:24:03.052 [cssd(7667900)]CRS-1610:Network communication with node orcldb1 (1) missing for 90% of timeout interval. Removal of this node from cluster in 2.497 seconds
2017-12-09 11:24:05.552 [cssd(7667900)]CRS-1609:This node is unable to communicate with other nodes in the cluster and is going down to preserve cluster integrity; details at (:CSSNM00008:) in /u01/app/11.2.0/grid/log/orcldb2/cssd/ocssd.log.
2017-12-09 11:24:05.552 [cssd(7667900)]CRS-1656:The CSS daemon is terminating due to a fatal error; Details at (:CSSSC00012:) in /u01/app/11.2.0/grid/log/orcldb2/cssd/ocssd.log
2017-12-09 11:24:05.614 [cssd(7667900)]CRS-1652:Starting clean up of CRSD resources.
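The three countdown warnings can be cross-checked against the moment the node actually went down by adding each "Removal ... in N seconds" value to its own timestamp. The sketch below does that arithmetic with awk. It assumes the timestamp and the message sit on the same line with the time of day in the second field, as in the excerpt above. The name projected_eviction is illustrative.

```shell
# For each CRS-1610/1611/1612 warning on standard input, print the message
# code and the projected eviction time (log time of day + remaining seconds).
# Assumes "date time [cssd(...)]CRS-nnnn:..." on a single line.
projected_eviction() {
    awk '
    /CRS-161[012]:/ && /Removal of this node from cluster in/ {
        split($2, t, ":")                      # $2 is e.g. 11:23:51.026
        now = t[1] * 3600 + t[2] * 60 + t[3]
        for (i = 2; i <= NF; i++)              # countdown precedes "seconds"
            if ($i == "seconds") left = $(i - 1)
        at = now + left
        h = int(at / 3600)
        m = int((at - h * 3600) / 60)
        s = at - h * 3600 - m * 60
        match($0, /CRS-[0-9]+/)
        printf "%s %02d:%02d:%06.3f\n", substr($0, RSTART, RLENGTH), h, m, s
    }'
}
```

For the three warnings above this gives 11:24:05.549, 11:24:05.548 and 11:24:05.549, a few milliseconds before the CRS-1609 message at 11:24:05.552. The node shutdown therefore followed directly from the lost heartbeat to orcldb1.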
3. Check the system log
IDENTIFIER TIMESTAMP  T C RESOURCE_NAME  DESCRIPTION
FE2DEE00   1209123617 P S SYSXAIXIF      DUPLICATE IP ADDRESS DETECTED IN THE NET
FE2DEE00   1209122517 P S SYSXAIXIF      DUPLICATE IP ADDRESS DETECTED IN THE NET
FE2DEE00   1209114417 P S SYSXAIXIF      DUPLICATE IP ADDRESS DETECTED IN THE NET
FE2DEE00   1209114317 P S SYSXAIXIF      DUPLICATE IP ADDRESS DETECTED IN THE NET
A924A5FC   1209112417 P S SYSPROC        SOFTWARE PROGRAM ABNORMALLY TERMINATED
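The TIMESTAMP column of the errpt summary uses the format MMDDhhmmYY, which is hard to line up with the other logs by eye. The sketch below decodes it. The function name decode_errpt_ts is illustrative, and the century is assumed to be 20YY.

```shell
# Decode an AIX errpt summary timestamp (MMDDhhmmYY) to "YYYY-MM-DD hh:mm".
# decode_errpt_ts is an illustrative name; the century is assumed to be 20YY.
decode_errpt_ts() {
    ts=$1
    mm=$(printf '%s' "$ts" | cut -c1-2)    # month
    dd=$(printf '%s' "$ts" | cut -c3-4)    # day
    hh=$(printf '%s' "$ts" | cut -c5-6)    # hour
    mi=$(printf '%s' "$ts" | cut -c7-8)    # minute
    yy=$(printf '%s' "$ts" | cut -c9-10)   # two-digit year
    printf '20%s-%s-%s %s:%s\n' "$yy" "$mm" "$dd" "$hh" "$mi"
}
```

decode_errpt_ts 1209112417 prints 2017-12-09 11:24, so the SOFTWARE PROGRAM ABNORMALLY TERMINATED entry falls in the same minute as the instance termination in the database log. The four DUPLICATE IP ADDRESS entries decode to 11:43, 11:44, 12:25 and 12:36 on the same day.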
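Taken together, all of the logs point to a communication problem.
I checked the heartbeat network: from node 1, pinging node 2 succeeded, and node 1 could of course ping itself.
That was puzzling, because the heartbeat network looked fine. Stepping back, I ran the test in the other direction: from node 2, pinging node 1 failed. After I raised this with the customer, it turned out that the network had just been reconfigured, and that change was the cause. Once the network engineer fixed it, the heartbeat network recovered. Then it was my turn to bring the cluster back up.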
# Run as root
crsctl stop crs       # failed with an error
crsctl stop crs -f    # force the shutdown
crsctl start crs
crsctl stat res -t
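The diagnosis turned on the direction of the ping: the test from node 1 succeeded while the one from node 2 failed, so the same check has to be run on every node. A minimal sketch of such a check follows. The name check_peers and the PING_CMD override are illustrative, and it assumes a ping that accepts -c for the packet count.

```shell
# check_peers HOST...: ping each host and report whether it answered.
# Returns non-zero if any host is unreachable. PING_CMD is an illustrative
# override for the ping command line; the default is "ping -c 2".
check_peers() {
    rc=0
    for host in "$@"; do
        # PING_CMD is left unquoted on purpose so "ping -c 2" splits into words.
        if ${PING_CMD:-ping -c 2} "$host" >/dev/null 2>&1; then
            echo "$host reachable"
        else
            echo "$host UNREACHABLE"
            rc=1
        fi
    done
    return $rc
}
```

Run on each node against the private interconnect names of all nodes, for example check_peers orcldb1-priv orcldb2-priv, where the -priv hostnames are hypothetical placeholders for the real heartbeat addresses.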