您好,登錄后才能下訂單哦!
基于生產環境 SQL Server 業務數據庫的特性,需要針對 SQL Server 生產環境幾組 AlwaysOn AG 的 Primary Replica 部署如下監控:
1. Windows集群節點不正常
2. 任一業務數據庫沒加入可用性組
3. AG中任一數據庫狀態異常
4. 任一數據庫延時超過閾值
1. primary replica 檢查Windows集群狀態不正常的節點數(>0 告警)
SELECT COUNT (*) as NotHealtyCount FROM sys.dm_hadr_cluster_members WHERE member_state_desc !='UP';
2. primary replica 檢查副本可用數據庫個數 與 需要加入AG的業務數據庫數 對比 (<后者告警)
SELECT COUNT(*) as DBCount FROM sys.availability_databases_cluster;
3. primary replica 檢查任一數據庫狀態異常(>0 告警)
SELECT COUNT (*) as NotHealtyCount FROM [master].[sys].[dm_hadr_database_replica_states] WHERE [synchronization_health_desc] != N'HEALTHY'
4. primary replica 檢查任一數據庫延時(>900 告警)
;WITH AG_Stats AS ( SELECT AR.replica_server_name, HARS.role_desc, Db_name(DRS.database_id) [DBName], DRS.last_commit_time FROM sys.dm_hadr_database_replica_states DRS INNER JOIN sys.availability_replicas AR ON DRS.replica_id = AR.replica_id INNER JOIN sys.dm_hadr_availability_replica_states HARS ON AR.group_id = HARS.group_id AND AR.replica_id = HARS.replica_id ), Pri_CommitTime AS ( SELECTreplica_server_name , DBName , last_commit_time FROMAG_Stats WHERErole_desc = 'PRIMARY' ), Sec_CommitTime AS ( SELECTreplica_server_name , DBName , last_commit_time FROMAG_Stats WHERErole_desc = 'SECONDARY' ) SELECT ISNULL(DATEDIFF(ss,s.last_commit_time,p.last_commit_time),0) AS [Sync_Lag_Secs] FROM Pri_CommitTime p LEFT JOIN Sec_CommitTime s ON [s].[DBName] = [p].[DBName]
Zabbix參數配置示例:
UserParameter=MSSQL.SyncHealth,"C:\Program Files\Microsoft SQL Server\Client SDK\ODBC\110\Tools\Binn\SQLCMD.EXE" -d Master -U zabbix -P zabbix -h -1 -W -Q "SELECT COUNT (*) as NotHealtyCount
FROM [master].[sys].[dm_hadr_database_replica_states]
WHERE [synchronization_health_desc] != N''HEALTHY''
"
建議,將1、3、4項配置為模板,2項需要單獨配置4個參數。
免責聲明:本站發布的內容(圖片、視頻和文字)以原創、轉載和分享為主,文章觀點不代表本網站立場,如果涉及侵權請聯系站長郵箱:is@yisu.com進行舉報,并提供相關證據,一經查實,將立刻刪除涉嫌侵權內容。