This article looks at how to understand the lmd process in Oracle RAC. It works through a few simple tests: killing lmd, killing lmon, and suspending lmd with oradebug, then reading the resulting alert logs.
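1. The test environment is Oracle 10.2.0.1 RAC.
2. If the lmd process terminates abnormally, PMON terminates the RAC instance it belongs to (ORA-00482), and a SYSTEMSTATE DUMP file is generated before the instance goes down. The instance is then started again.
3. The official manual describes lmon as the process that monitors global enqueues and resources. In the tests here lmon does not restart a dead lmd on its own: killing either process brings the whole instance down, and both come back only with the restarted instance.
4. lmd provides the Global Enqueue Service (GES); put simply, it manages resource requests across the RAC instances. This is why LMD matters: if it stops responding, DML that needs a global resource hangs, which in turn shows up as IPC send timeouts between the RAC nodes.
5. The IPC send timeouts produce a corresponding LMD trace file.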
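--What lmd is
lmd is the process that provides the Global Enqueue Service (GES).
It handles the resource requests that arrive at each RAC instance from remote RAC nodes. It is a daemon process, so one might expect a monitoring process to guard it and restart it if it disappears.
--In practice, if lmd terminates abnormally the RAC instance is forcibly shut down, and a systemstate dump is generated for analysis before the instance closes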
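The tests in this article locate lmd with `ps -ef|grep lmd`, which also lists the grep command itself. As a minimal sketch, the PID can instead be picked out by exact process name. The function name `bg_pid` is made up for this example and is not an Oracle utility.

```shell
#!/bin/sh
# bg_pid NAME: read "PID COMMAND ..." lines on stdin and print the PID
# of every line whose command is exactly NAME.
# bg_pid is a hypothetical helper for this article, not an Oracle utility.
bg_pid() {
    awk -v name="$1" '$2 == name { print $1 }'
}

# Demo on lines shaped like the ps output in this article.
printf '%s\n' \
    '4774 asm_lmd0_+ASM1' \
    '11220 ora_lmd0_jingfa1' \
    '30706 grep lmd' | bg_pid ora_lmd0_jingfa1
```

On a live node the input would come from something like `ps -e -o pid= -o args= | bg_pid ora_lmd0_jingfa1`. Matching the whole name keeps the ASM instance's `asm_lmd0_+ASM1` and the grep process out of the result.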
[oracle@jingfa1 ~]$ ps -ef|grep lmd
oracle 4774 1 0 Nov09 ? 00:00:31 asm_lmd0_+ASM1
oracle 11220 1 0 02:13 ? 00:00:15 ora_lmd0_jingfa1
oracle 30706 30376 0 05:19 pts/3 00:00:00 grep lmd
[oracle@jingfa1 ~]$ kill -9 11220
Tue Nov 10 05:20:03 2015
Errors in file /u01/app/oracle/admin/jingfa/bdump/jingfa1_pmon_11212.trc:
ORA-00482: LMD* process terminated with error
Tue Nov 10 05:20:03 2015
PMON: terminating instance due to error 482
Tue Nov 10 05:20:03 2015
Errors in file /u01/app/oracle/admin/jingfa/bdump/jingfa1_lms0_11222.trc:
ORA-00482: LMD* process terminated with error
Tue Nov 10 05:20:03 2015
System state dump is made for local instance
System State dumped to trace file /u01/app/oracle/admin/jingfa/bdump/jingfa1_diag_11214.trc
Tue Nov 10 05:20:03 2015
Trace dumping is performing id=[cdmp_20151110052003]
Tue Nov 10 05:20:08 2015
Instance terminated by PMON, pid = 11212
--Immediately afterwards the instance is started again automatically
Tue Nov 10 05:21:05 2015
Starting ORACLE instance (normal)
LICENSE_MAX_SESSION = 0
The lmd process is running again, now under a new PID
[oracle@jingfa1 ~]$ ps -ef|grep lmd
oracle 3474 30376 0 05:23 pts/3 00:00:00 grep lmd
oracle 4774 1 0 Nov09 ? 00:00:31 asm_lmd0_+ASM1
oracle 32703 1 0 05:21 ? 00:00:00 ora_lmd0_jingfa1
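To find this kind of event in a longer alert log, the PMON termination messages can be filtered out together with their timestamps. This is only a sketch: `pmon_terminations` is a hypothetical helper, and it assumes the log layout shown in the excerpt above, where each message is preceded by a timestamp line of its own.

```shell
#!/bin/sh
# pmon_terminations: read an alert log on stdin and print one line per
# "PMON: terminating instance" message, prefixed with the most recent
# timestamp line. Hypothetical helper; assumes the layout shown in this
# article (timestamp on its own line before each message).
pmon_terminations() {
    awk '
        /^(Mon|Tue|Wed|Thu|Fri|Sat|Sun) / { ts = $0; next }
        /^PMON: terminating instance due to error/ { print ts " | " $0 }
    '
}

# Demo on two lines copied from the alert log in this article.
printf '%s\n' \
    'Tue Nov 10 05:20:03 2015' \
    'PMON: terminating instance due to error 482' | pmon_terminations
```

Against a real file the call would look like `pmon_terminations < alert_jingfa1.log`; the file name is an assumption, since the article does not show the alert log path.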
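As mentioned above, some monitoring process is supposed to look after the health of lmd. The official manual points to lmon: LMON manages the cross-instance (global) enqueues and resources of each RAC instance and performs recovery of global enqueue locks.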
[oracle@jingfa1 bdump]$ ps -ef|grep lmon
oracle 4772 1 0 Nov09 ? 00:00:29 asm_lmon_+ASM1
oracle 19857 30376 0 05:34 pts/3 00:00:00 grep lmon
oracle 32701 1 0 05:21 ? 00:00:02 ora_lmon_jingfa1
[oracle@jingfa1 bdump]$ kill -9 32701
After LMON is killed, the LMD process of the same instance is gone as well
[oracle@jingfa1 bdump]$ ps -ef|grep lmd
oracle 4774 1 0 Nov09 ? 00:00:32 asm_lmd0_+ASM1
oracle 21171 30376 0 05:34 pts/3 00:00:00 grep lmd
So killing the lmon process also forces the database instance to restart
Tue Nov 10 05:34:18 2015
Errors in file /u01/app/oracle/admin/jingfa/bdump/jingfa1_pmon_32695.trc:
ORA-00481: LMON process terminated with error
Tue Nov 10 05:34:18 2015
PMON: terminating instance due to error 481
Tue Nov 10 05:34:18 2015
System state dump is made for local instance
System State dumped to trace file /u01/app/oracle/admin/jingfa/bdump/jingfa1_diag_32697.trc
Tue Nov 10 05:34:18 2015
Trace dumping is performing id=[cdmp_20151110053418]
Tue Nov 10 05:34:23 2015
Instance terminated by PMON, pid = 32695
Tue Nov 10 05:35:19 2015
Starting ORACLE instance (normal)
lmon and lmd come back automatically
[oracle@jingfa1 bdump]$ ps -ef|grep lmon
oracle 4772 1 0 Nov09 ? 00:00:30 asm_lmon_+ASM1
oracle 21820 1 0 05:35 ? 00:00:01 ora_lmon_jingfa1
oracle 27926 30376 0 05:39 pts/3 00:00:00 grep lmon
[oracle@jingfa1 bdump]$ ps -ef|grep lmd
oracle 4774 1 0 Nov09 ? 00:00:33 asm_lmd0_+ASM1
oracle 21822 1 0 05:35 ? 00:00:00 ora_lmd0_jingfa1
oracle 28028 30376 0 05:39 pts/3 00:00:00 grep lmd
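In both tests the alert log shows a little under a minute between "Instance terminated by PMON" and "Starting ORACLE instance" (57 and 56 seconds). A rough way to measure that gap is sketched below. `restart_gaps` is a hypothetical helper; it only looks at the HH:MM:SS field, so it assumes termination and restart happen on the same calendar day.

```shell
#!/bin/sh
# restart_gaps: read an alert log on stdin and print, for each
# "Instance terminated by PMON" followed by "Starting ORACLE instance",
# the number of seconds between the two timestamps.
# Hypothetical helper; assumes both events fall on the same day.
restart_gaps() {
    awk '
        # Timestamp line, e.g. "Tue Nov 10 05:20:08 2015": field 4 is HH:MM:SS.
        /^(Mon|Tue|Wed|Thu|Fri|Sat|Sun) / {
            split($4, t, ":")
            secs = t[1] * 3600 + t[2] * 60 + t[3]
            next
        }
        /^Instance terminated by PMON/ { down = secs; have_down = 1 }
        /^Starting ORACLE instance/ && have_down {
            print secs - down
            have_down = 0
        }
    '
}

# Demo on lines copied from the first test in this article; prints 57.
printf '%s\n' \
    'Tue Nov 10 05:20:08 2015' \
    'Instance terminated by PMON, pid = 11212' \
    'Tue Nov 10 05:21:05 2015' \
    'Starting ORACLE instance (normal)' | restart_gaps
```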
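By extension, something must make sure that lmon and lmd are started again after they terminate abnormally. What is that mechanism?
I compared the operating-system level processes, mainly the scripts under /etc/init.d, and found no process dedicated to controlling lmon or lmd: they belong to the Oracle instance layer, not the cluster layer. The alert logs above show the whole instance being started again ("Starting ORACLE instance (normal)") rather than individual processes being respawned, which is consistent with Oracle Clusterware restarting the instance resource, although this test did not verify that.
Let's try another angle: which parameters relate to the lmd process, and what do they mean?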
NAME_1 VALUE_1 DESC1
-------------------------------------------------- -------------------------------------------------- --------------------------------------------------
_lm_lmd_waittime 8 default wait time for lmd in centiseconds
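The article does not show the query behind this output. A commonly used way to list hidden parameters is to join the `x$ksppi` and `x$ksppcv` fixed tables as SYSDBA; the sketch below generates such a statement. The function name `hidden_param_sql` is hypothetical, and the column aliases are chosen only to mirror the headings above.

```shell
#!/bin/sh
# hidden_param_sql PATTERN: print a SQL statement that lists hidden
# parameters whose name matches the LIKE pattern PATTERN.
# Hypothetical helper; the author's actual query is not shown in the
# article, so this join is an assumption about how the output was made.
hidden_param_sql() {
    cat <<EOF
select i.ksppinm name_1, v.ksppstvl value_1, i.ksppdesc desc1
from x\$ksppi i, x\$ksppcv v
where i.indx = v.indx
and i.ksppinm like '$1';
EOF
}

# Print the statement for parameters starting with _lm_lmd.
hidden_param_sql '_lm_lmd%'
```

The statement could then be piped to SQL*Plus, for example `hidden_param_sql '_lm_lmd%' | sqlplus -s "/ as sysdba"`. Note that `_` is itself a single-character wildcard in LIKE, so the pattern is slightly broader than it looks.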
---node1
SQL> select addr,program,username,pid,spid from v$process where username='oracle' and spid=21822;
ADDR PROGRAM USERNAME PID SPID
---------------- ------------------------------------------------ --------------- ---------- ------------
0000000083A585C8 oracle@jingfa1 (LMD0) oracle 6 21822
--node2
SQL> select addr,program,username,pid,spid from v$process where username='oracle' and spid=668;
ADDR PROGRAM USERNAME PID SPID
---------------- ------------------------------------------------ --------------- ---------- ------------
0000000083A585C8 oracle@jingfa2 (LMD0) oracle 6 668
--node2
SQL> conn tbs_zxy/system
Connected.
SQL> update t_lock set a=11 where a=1;
1 row updated.
--node1
SQL> update t_lock set a=1111 where a=1;
--hangs
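Evidently the parameter above is not directly related to lock detection: the second update simply waits for the uncommitted row lock held on node 2. lmd, however, is involved in global locks.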
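While the update on node 1 is hanging, the holder and the waiter can be looked up from either node through `gv$lock`. The sketch below only prints a candidate query; `tx_lock_sql` is a hypothetical helper, and the filter on TX enqueues is an assumption that fits this row-lock test.

```shell
#!/bin/sh
# tx_lock_sql: print a query that lists TX enqueue holders (block > 0)
# and waiters (request > 0) across all RAC instances.
# Hypothetical helper written for this article's t_lock test.
tx_lock_sql() {
    cat <<'EOF'
select inst_id, sid, type, id1, id2, lmode, request, block
from gv$lock
where type = 'TX'
and (block > 0 or request > 0)
order by id1, id2, request;
EOF
}

# Print the statement; pipe it to sqlplus on a live system.
tx_lock_sql
```

Rows that share the same `id1` and `id2` belong to the same transaction lock, so the holder on one instance and the waiter on the other end up next to each other.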
Another approach: what happens if we suspend lmd with oradebug and then create a global lock conflict?
---node1
Suspend lmd
SQL> oradebug setospid 21822
Oracle pid: 6, Unix process pid: 21822, image: oracle@jingfa1 (LMD0)
SQL> oradebug suspend
Statement processed.
Tue Nov 10 06:03:44 2015
Unix process pid: 21822, image: oracle@jingfa1 (LMD0) flash frozen
---node2
Suspend lmd
SQL> oradebug setospid 668
Oracle pid: 6, Unix process pid: 668, image: oracle@jingfa2 (LMD0)
SQL> oradebug suspend
Statement processed.
Tue Nov 10 06:06:08 2015
Unix process pid: 668, image: oracle@jingfa2 (LMD0) flash frozen
---node2
SQL> update t_lock set a=11 where a=1;
1 row updated.
--node1
SQL> update t_lock set a=1111 where a=1;
--hangs
Now look at the alert logs of node 1 and node 2
--node2
Tue Nov 10 06:09:42 2015
IPC Send timeout detected.Sender: ospid 682 --the sender is the SMON process
Receiver: inst 1 binc 432326879 ospid 21822 --the receiver is the LMD process on node 1
Tue Nov 10 06:09:45 2015
IPC Send timeout to 0.0 inc 20 for msg type 12 from opid 12 --same event; opid 12 is the sender, presumably SMON again
Tue Nov 10 06:09:45 2015
Communications reconfiguration: instance_number 1
Tue Nov 10 06:09:45 2015
IPC Send timeout detected.Sender: ospid 696 --the sender is the MMON process
Receiver: inst 1 binc 432326879 ospid 21822 --the receiver is again the LMD process on node 1
Tue Nov 10 06:09:48 2015
IPC Send timeout to 0.0 inc 20 for msg type 12 from opid 15 --same event; opid 15 is the sender, MMON (Oracle pid 15)
--node1
Tue Nov 10 06:09:23 2015
IPC Send timeout detected. Receiver ospid 21822 --the receiver is the LMD process
Tue Nov 10 06:09:23 2015
Errors in file /u01/app/oracle/admin/jingfa/bdump/jingfa1_lmd0_21822.trc: --an LMD trace file is generated
IPC Send timeout detected. Receiver ospid 21822 --same as above
Tue Nov 10 06:09:27 2015
Errors in file /u01/app/oracle/admin/jingfa/bdump/jingfa1_lmd0_21822.trc:
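This shows that lmd is indeed involved in acquiring global locks: when the LMD process malfunctions, communication between the two RAC nodes breaks down. The sender PIDs seen on node 2 can be confirmed as SMON and MMON: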
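The sender and receiver of each timeout can also be pulled out of the node 2 alert log mechanically. The sketch below assumes the two-line "Sender / Receiver" layout shown in this excerpt; `ipc_timeouts` is a hypothetical helper, and the single-line form logged on node 1 is deliberately skipped.

```shell
#!/bin/sh
# ipc_timeouts: read an alert log on stdin and print one line per
# "IPC Send timeout detected.Sender" / "Receiver" pair.
# Hypothetical helper; assumes the two-line layout shown in this article.
ipc_timeouts() {
    awk '
        /^IPC Send timeout detected\.Sender: ospid/ { sender = $NF; next }
        /^Receiver: inst/ && sender != "" {
            print "sender=" sender " receiver_inst=" $3 " receiver_ospid=" $NF
            sender = ""
        }
    '
}

# Demo on two lines from the node 2 alert log (inline comments removed).
printf '%s\n' \
    'IPC Send timeout detected.Sender: ospid 682' \
    'Receiver: inst 1 binc 432326879 ospid 21822' | ipc_timeouts
```

Each sender ospid reported this way can then be mapped to a background process name with `ps` or `v$process`.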
[oracle@jingfa2 bdump]$ ps -ef|grep 682
oracle 682 1 0 02:14 ? 00:00:01 ora_smon_jingfa2
oracle 7157 13004 0 06:15 pts/1 00:00:00 grep 682
SQL> select spid,pid,program from v$process where spid=696;
SPID PID PROGRAM
------------ ---------- ------------------------------------------------
696 15 oracle@jingfa2 (MMON)
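The test leaves lmd suspended on both nodes. `oradebug resume` is the counterpart of `oradebug suspend`, so the cleanup can be sketched as below. `oradebug_cmds` is a hypothetical helper that only prints the commands; whether the instances recover cleanly after a long suspension is not covered by this article.

```shell
#!/bin/sh
# oradebug_cmds OSPID ACTION: print the oradebug commands that attach to
# the process with OS pid OSPID and suspend or resume it.
# Hypothetical helper; it prints text only and does not touch the database.
oradebug_cmds() {
    case "$2" in
        suspend|resume) ;;
        *) echo "usage: oradebug_cmds OSPID suspend|resume" >&2; return 1 ;;
    esac
    printf 'oradebug setospid %s\noradebug %s\n' "$1" "$2"
}

# Commands to resume the LMD process suspended on node 1 in this article.
oradebug_cmds 21822 resume
```

On node 1 the output could be piped to SQL*Plus with `oradebug_cmds 21822 resume | sqlplus -s "/ as sysdba"`, and likewise with ospid 668 on node 2.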