您好,登錄后才能下訂單哦!
Oracle 19C Data Guard 基礎運維 -0 5Failovers (GAP)
原主庫 |
原備庫 |
Failovers |
新主庫 |
獨立庫 |
192.168.31.90 |
192.168.31.100 |
192.168.31.100 |
192.168.31.90 | |
cjcdb |
chendb |
chendb |
cjcdb |
Failover :
https://docs.oracle.com/en/database/oracle/oracle-database/19/sbydb/data-guard-concepts-and-administration.pdf
Figure 9-4 Failover to a Standby Database
Performing a Failover to a Physical Standby Database
關于archive gap 的問題?
上一篇博客《04 Failovers疑問?》寫了關于 archive gap的疑問,在實驗中,我提前將備庫關機,主庫端插入大量數據產生 3個歸檔文件,并手動將最后 3個歸檔文件重命名,目的是不讓備庫獲取到這三個歸檔文件,在啟動備庫,試圖模擬出備庫 archive gap場景,但是在備庫端 v$archive_gap中顯示空的,備庫沒有檢測出 archive gap的存在嗎?
實際上是本人對archive gap概念存在一些誤解,比如主庫有 1到 100個歸檔,我認為只要有任何歸檔文件在備庫端獲取失敗都會出現 archive gap,都會記錄到 v$archive_gap,通過上一篇實驗發現這種理論顯然是不對的,我強制將主庫 98,99,100三個歸檔文件重命名,備庫端并沒有出現 archive gap,即在 v$archive_gap中不會有數據。
那么究竟什么場景才會出現archive gap?真實的場景是,備庫在接收主庫歸檔文件時有部分沒有接收成功,但后續的歸檔文件又接收成功了,比如主庫 1到 100個歸檔文件,出于某種原因,備庫沒有接收到 97,98兩個歸檔,但是后面的 99,100歸檔又能正常接收,這時就會產生 archive gap,在 v$archive_gap會查到 97,98歸檔信息。 (感謝墨天輪平臺“你好我是李白”的答疑解惑 )
實驗過程如下:
場景二:archive gap下的failover
主庫模擬故障,模擬歸檔gap :
先停掉備庫: 不接收主庫產生的 redo 或歸檔數據
SQL> shutdown immediate
主庫:生成測試數據,生成redo 和歸檔數據
---session 1
SQL>
declare
begin
for i in 1 .. 1000 000 loop
insert into test1 values (i);
commit;
end loop;
end;
插入數據期間,生成了3 個歸檔文件
[oracle@cjcos01 arch]$ pwd
/arch
......
cjcpdb_arch_1_74_1030641846.arc
cjcpdb_arch_1_75_1030641846.arc
cjcpdb_arch_1_76_1030641846.arc
主庫重命名新產生的前兩個歸檔文件,模擬歸檔gap
[oracle@cjcos01 arch]$ mv cjcpdb_arch_1_74_1030641846.arc cjcpdb_arch_1_74_1030641846.arc.bak
[oracle@cjcos01 arch]$ mv cjcpdb_arch_1_75_1030641846.arc cjcpdb_arch_1_75_1030641846.arc.bak
再次插入部分數據
SQL>
declare
begin
for i in 1 .. 1000 0 loop
insert into test1 values (i);
commit;
end loop;
end;
啟動備庫:
SQL> startup
-- 備庫啟動時,查看對應主庫日志,提示找不到 74,75 兩個歸檔文件,無法將 74,75 發送到備庫端。
2020-04-19T18:37:53.170879+08:00
Errors in file /u01/app/oracle/diag/rdbms/cjcdb/cjcdb/trace/cjcdb_tt00_2349.trc:
ORA-00308: cannot open archived log '/arch/cjcpdb_arch_1_75_1030641846.arc'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 7
2020-04-19T18:37:53.171203+08:00
Errors in file /u01/app/oracle/diag/rdbms/cjcdb/cjcdb/trace/cjcdb_tt00_2349.trc:
ORA-00308: cannot open archived log '/arch/cjcpdb_arch_1_74_1030641846.arc'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 7
備庫:查看archive log ,實際應該是 74 到 75 ,不清楚為什么會顯示 73
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;
SQL> select thread#, low_sequence#, high_sequence# from v$archive_gap;
THREAD# LOW_SEQUENCE# HIGH_SEQUENCE#
---------- ------------- --------------
1 73 75
備庫:沒有接收到74,75 兩個歸檔文件
主庫重命名system01.dbf 模擬數據庫故障
[oracle@cjcos01 arch]$ cd /u01/app/oracle/oradata/CJCDB/
[oracle@cjcos01 CJCDB]$ mv system01.dbf system01.dbf.bak
SQL> alter system checkpoint;
SQL> shutdown abort
主庫啟動失敗
SQL> startup
ORACLE instance started.
Total System Global Area 1375728192 bytes
Fixed Size 9134656 bytes
Variable Size 1107296256 bytes
Database Buffers 251658240 bytes
Redo Buffers 7639040 bytes
Database mounted.
ORA-01157: cannot identify/lock data file 1 - see DBWR trace file
ORA-01110: data file 1: '/u01/app/oracle/oradata/CJCDB/system01.dbf'
SQL> select open_mode from v$database;
OPEN_MODE
--------------------
MOUNTED
備庫:
1. 檢查 dg 恢復模式 ( 最大性能模式 )
SQL> select database_role,protection_level,protection_mode from v$database;
DATABASE_ROLE PROTECTION_LEVEL PROTECTION_MODE
---------------- -------------------- --------------------
PHYSICAL STANDBY MAXIMUM PERFORMANCE MAXIMUM PERFORMANCE
2 檢查 archive_gap ,實際應該是74 到 75 ,不清楚為什么會顯示 73
SQL> select thread#, low_sequence#, high_sequence# from v$archive_gap;
THREAD# LOW_SEQUENCE# HIGH_SEQUENCE#
---------- ------------- --------------
1 73 75
主庫:
SQL> select name from v$archived_log where thread#=1 and sequence# between 73 and 75;
NAME
--------------------------------------------------------------------------------
/arch/cjcpdb_arch_1_73_1030641846.arc
/arch/cjcpdb_arch_1_74_1030641846.arc
/arch/cjcpdb_arch_1_75_1030641846.arc
將73 歸檔文件拷貝到備庫端
[oracle@cjcos01 arch]$ scp cjcpdb_arch_1_73_1030641846.arc cjcos02:/arch
主庫在mount 狀態下執行 flush redo 操作
SQL> ALTER SYSTEM FLUSH REDO TO chendb;
ALTER SYSTEM FLUSH REDO TO chendb
*
ERROR at line 1:
ORA-16416: No viable switchover targets available
備庫:手動注冊73 號歸檔,也顯示歸檔已經注冊了
SQL> alter database register logfile '/arch/cjcpdb_arch_1_73_1030641846.arc';
alter database register logfile '/arch/cjcpdb_arch_1_73_1030641846.arc'
*
ERROR at line 1:
ORA-16089: archive log has already been registered
但是archive gap 還是顯示有 73
SQL> select thread#, low_sequence#, high_sequence# from v$archive_gap;
THREAD# LOW_SEQUENCE# HIGH_SEQUENCE#
---------- ------------- --------------
1 73 75
主庫:將74 號歸檔文件名改回來
[oracle@cjcos01 arch]$ mv cjcpdb_arch_1_74_1030641846.arc.bak cjcpdb_arch_1_74_1030641846.arc
再次執行flush redo
SQL> ALTER SYSTEM FLUSH REDO TO chendb;
ALTER SYSTEM FLUSH REDO TO chendb
*
ERROR at line 1:
ORA-16416: No viable switchover targets available
查看主庫日志,主庫已經將74 歸檔發生備庫端了,開始嘗試讀取 75 號歸檔文件。
如果flush redo 命令沒生效,也可以將歸檔文件拷到備庫端,手動執行注冊
SQL> ALTER DATABASE REGISTER PHYSICAL LOGFILE 'filespec1';
備庫:只有1 個 75 號歸檔找不到了
SQL> select thread#, low_sequence#, high_sequence# from v$archive_gap;
THREAD# LOW_SEQUENCE# HIGH_SEQUENCE#
---------- ------------- --------------
1 75 75
備庫:取消應用進程
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
Database altered.
備庫:由于存在archive gap ,是不允許常規的 failover
SQL> ALTER DATABASE FAILOVER TO chendb;
ALTER DATABASE FAILOVER TO chendb
*
ERROR at line 1:
ORA-00283: recovery session canceled due to errors
ORA-16171: RECOVER...FINISH not allowed due to gap for thr 1, seq 75-75
備庫:加force 也不生效
SQL> ALTER DATABASE FAILOVER TO chendb force;
ALTER DATABASE FAILOVER TO chendb force
*
ERROR at line 1:
ORA-00283: recovery session canceled due to errors
ORA-16171: RECOVER...FINISH not allowed due to gap for thr 1, seq 75-75
強制failover: 在存在 archive gap 情況下,強制執行 failover ,會丟失數據,正式環境謹慎使用!!!
Perform a data loss failover.
If an error condition cannot be resolved, a failover can still be performed (with some data loss) by issuing the following SQL statement on the target standby database:
SQL> ALTER DATABASE ACTIVATE PHYSICAL STANDBY DATABASE;
打開數據庫
SQL> ALTER DATABASE OPEN;
查看數據
SQL> select count(*) from test1;
COUNT(*)
----------
252780
test1 表丟失了 1000 000+1000- 252780 =748220 條數據。
歡迎關注我的微信公眾號"IT小Chen",共同學習,共同成長!!!
免責聲明:本站發布的內容(圖片、視頻和文字)以原創、轉載和分享為主,文章觀點不代表本網站立場,如果涉及侵權請聯系站長郵箱:is@yisu.com進行舉報,并提供相關證據,一經查實,將立刻刪除涉嫌侵權內容。