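Preface:
Unless stated otherwise, all configuration in this article is done on nagios_server.
Configuration flow:
nagios.cfg --> hosts.cfg --> services.cfg --> commands.cfg
Create hosts.cfg to define hosts and host groups
Create services.cfg to define services
Use the default contacts.cfg to define contacts and contact groups
Use the default commands.cfg to define commands
Use the default timeperiods.cfg to define monitoring time periods
Use the default templates.cfg as the template file that other definitions reference
/usr/local/nagios/etc/ directory structure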
[root@chboc etc]# tree /usr/local/nagios/etc
/usr/local/nagios/etc
|-- cgi.cfg
|-- htpasswd.users
|-- nagios.cfg
|-- nagios.cfg.bak
|-- nrpe.cfg
|-- objects
|   |-- commands.cfg
|   |-- contacts.cfg
|   |-- hosts.cfg          defines the monitored remote_hosts and remote_hosts_group
|   |-- hosts.cfg.bak
|   |-- localhost.cfg
|   |-- printer.cfg
|   |-- services.cfg       defines passive-mode service checks for the local resources of remote_linux
|   |-- switch.cfg
|   |-- templates.cfg
|   |-- timeperiods.cfg
|   `-- windows.cfg
|-- resource.cfg
`-- services               defines active-mode service checks for the services remote_linux exposes
    `-- web.cfg
Note: after creating hosts.cfg, services.cfg, and the services directory, change their owner and group to nagios.
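As a rough way to verify the ownership note above, the following sketch reports files whose owner or group differs from an expected numeric uid/gid. The function name wrong_owner and the temporary demo file are illustrative assumptions, not part of the original setup.

```python
import os
import tempfile


def wrong_owner(paths, uid, gid):
    """Return the paths whose owner uid or group gid differs from the expected ones."""
    bad = []
    for path in paths:
        st = os.stat(path)
        if st.st_uid != uid or st.st_gid != gid:
            bad.append(path)
    return bad


# Demo on a throwaway file (hypothetical stand-in for hosts.cfg).
with tempfile.NamedTemporaryFile(suffix=".cfg", delete=False) as handle:
    demo_path = handle.name
demo_stat = os.stat(demo_path)

# Expected ids match the file: nothing is reported.
print(wrong_owner([demo_path], demo_stat.st_uid, demo_stat.st_gid))
# Expected uid differs from the file: the path is reported.
print(wrong_owner([demo_path], demo_stat.st_uid + 1, demo_stat.st_gid))
```

On a real server the expected ids could be looked up with pwd.getpwnam("nagios").pw_uid and grp.getgrnam("nagios").gr_gid, assuming the nagios user and group exist.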
nagios.cfg
[root@chboc etc]# diff nagios.cfg nagios.cfg.bak
34,35d33
< cfg_file=/usr/local/nagios/etc/objects/services.cfg
< cfg_file=/usr/local/nagios/etc/objects/hosts.cfg
38c36
< #cfg_file=/usr/local/nagios/etc/objects/localhost.cfg
---
> cfg_file=/usr/local/nagios/etc/objects/localhost.cfg
54c52
< cfg_dir=/usr/local/nagios/etc/services
---
> #cfg_dir=/usr/local/nagios/etc/servers
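Put simply, nagios.cfg only specifies which files and directories Nagios loads at startup.
Edit the main configuration file nagios.cfg and add services.cfg and hosts.cfg, so that Nagios loads their contents automatically when it starts.
Nagios is generally used to monitor services on remote_linux, so the line cfg_file=/usr/local/nagios/etc/objects/localhost.cfg is commented out here.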
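To make the "which files get loaded" idea concrete, here is a minimal sketch that lists the active cfg_file and cfg_dir entries in a nagios.cfg text. The function name loaded_paths and the SAMPLE string are illustrative; the sample lines are taken from the diff above. It only reads the directives and does not reproduce how Nagios itself walks a cfg_dir.

```python
def loaded_paths(nagios_cfg_text):
    """Return (cfg_files, cfg_dirs) for directives that are not commented out."""
    cfg_files, cfg_dirs = [], []
    for raw in nagios_cfg_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or commented-out directive
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a key=value directive
        key, value = key.strip(), value.strip()
        if key == "cfg_file":
            cfg_files.append(value)
        elif key == "cfg_dir":
            cfg_dirs.append(value)
    return cfg_files, cfg_dirs


SAMPLE = """\
cfg_file=/usr/local/nagios/etc/objects/services.cfg
cfg_file=/usr/local/nagios/etc/objects/hosts.cfg
#cfg_file=/usr/local/nagios/etc/objects/localhost.cfg
cfg_dir=/usr/local/nagios/etc/services
"""

files, dirs = loaded_paths(SAMPLE)
print(files)
print(dirs)
```

The commented-out localhost.cfg line is skipped, which matches the intent of the change shown in the diff.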
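In this article, Nagios monitoring is loosely divided into an active mode and a passive mode (NRPE). These are the author's working terms: in official Nagios terminology an NRPE check is still initiated by the server, and "passive checks" are results submitted by an external process.
Active mode: generally used for services exposed to the outside, such as web servers and databases, e.g. httpd, mysqld, sshd.
Passive mode: generally used for local resources such as load, memory, disk, swap, disk I/O, temperature, and fans (some system resources can also be monitored through SNMP).
The two modes are interchangeable; a given check is not tied to one of them.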
hosts.cfg
[root@chboc objects]# egrep -v "^$|^#" hosts.cfg
define host{
use linux-server
host_name lnmp
alias 198-lnmp
address 192.168.1.198
}
define host{
use linux-server
host_name lamp
alias 199-lamp
address 192.168.1.199
}
define hostgroup{
hostgroup_name linux-servers ; The name of the hostgroup
alias Linux Servers ; Long name of the group
members lnmp,lamp ; Comma separated list of hosts that belong to this group
}
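This defines the remote_linux hosts to be monitored and puts them in a group.
Generate the hosts.cfg file (copy the first 51 lines of localhost.cfg as a starting point, then edit them):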
head -51 localhost.cfg >hosts.cfg
chown nagios.nagios /usr/local/nagios/etc/objects/hosts.cfg
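When there are more than a few hosts, the definitions can be generated instead of edited by hand. The sketch below renders the same two hosts and host group as shown above; render_host, render_hostgroup, and HOSTS are hypothetical helper names, and the output is meant as a starting point to review, not a drop-in replacement for hosts.cfg.

```python
def render_host(host_name, alias, address, template="linux-server"):
    """Render one 'define host' block using the linux-server template."""
    return (
        "define host{\n"
        f"    use        {template}\n"
        f"    host_name  {host_name}\n"
        f"    alias      {alias}\n"
        f"    address    {address}\n"
        "}\n"
    )


def render_hostgroup(name, alias, members):
    """Render one 'define hostgroup' block with a comma-separated member list."""
    return (
        "define hostgroup{\n"
        f"    hostgroup_name  {name}\n"
        f"    alias           {alias}\n"
        f"    members         {','.join(members)}\n"
        "}\n"
    )


# Host data copied from the hosts.cfg shown in this article.
HOSTS = [
    ("lnmp", "198-lnmp", "192.168.1.198"),
    ("lamp", "199-lamp", "192.168.1.199"),
]

config_text = "\n".join(render_host(*host) for host in HOSTS)
config_text += "\n" + render_hostgroup(
    "linux-servers", "Linux Servers", [host[0] for host in HOSTS]
)
print(config_text)
```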
services.cfg --> commands.cfg
     |
     +---> nrpe.cfg (remote_linux)
[root@chboc objects]# egrep -v "^$|^#" services.cfg
define service {
use generic-service
host_name lnmp
service_description Disk Partition
check_command check_nrpe!check_disk
}
define service {
use generic-service
host_name lnmp
service_description load
check_command check_nrpe!check_load
}
define service {
use generic-service
host_name lnmp
service_description mem
check_command check_nrpe!check_mem
}
define service {
use generic-service
host_name lnmp
service_description swap
check_command check_nrpe!check_swap
}
define service {
use generic-service
host_name lnmp
service_description iostat
check_command check_nrpe!check_iostat
}
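In services.cfg I use NRPE in passive mode: the check_nrpe plugin on the nagios_server host calls the NRPE daemon running on remote_linux to monitor the local resources of remote_linux.
How NRPE works
NRPE consists of two parts:
the check_nrpe plugin, located on the monitoring host
the NRPE daemon, running on the remote Linux host (usually the monitored machine)
According to the figure above, the whole monitoring process is as follows.
When Nagios needs to check a service or resource on a remote Linux host:
Nagios runs the check_nrpe plugin and tells it what to check;
the check_nrpe plugin connects to the remote NRPE daemon over SSL;
the NRPE daemon runs the corresponding Nagios plugin to perform the check;
the NRPE daemon returns the result to the check_nrpe plugin, which passes it to Nagios for processing.
Note: the NRPE daemon requires the Nagios plugins to be installed on the remote Linux host; without them the daemon cannot perform any checks.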
commands.cfg
--------------------------------------------------------------------------
[root@chboc objects]# tail -4 commands.cfg
define command{
command_name check_nrpe
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}
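To see how a service's check_command such as check_nrpe!check_disk becomes an actual command line, the following simplified sketch substitutes the macros in the command_line above. The function expand_check_command is illustrative only; the $USER1$ value is an assumption based on the plugin directory used in this article, and real Nagios macro handling (for example escaped "!" characters) is more involved.

```python
def expand_check_command(check_command, commands, macros):
    """Expand 'name!arg1!arg2' against a command_line template.

    Simplified: arguments are split on every '!' and macros are
    replaced by plain string substitution.
    """
    name, *args = check_command.split("!")
    line = commands[name]
    values = dict(macros)
    for index, arg in enumerate(args, start=1):
        values["$ARG%d$" % index] = arg
    for macro, value in values.items():
        line = line.replace(macro, value)
    return line


# command_line copied from commands.cfg above.
commands = {"check_nrpe": "$USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$"}

# Assumed macro values: plugin directory and the address of host lnmp.
macros = {
    "$USER1$": "/usr/local/nagios/libexec",
    "$HOSTADDRESS$": "192.168.1.198",
}

expanded = expand_check_command("check_nrpe!check_disk", commands, macros)
print(expanded)
```

With these assumed values the result is the same kind of command that is run by hand from the libexec directory later in this article.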
nrpe.cfg(remote_linux)
------------------------------------------------------------------------
log_facility=daemon
pid_file=/var/run/nrpe.pid
server_port=5666
nrpe_user=nagios
nrpe_group=nagios
allowed_hosts=127.0.0.1,192.168.1.201
dont_blame_nrpe=0
debug=0
command_timeout=60
connection_timeout=300
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,6 -c 30,25,20
command[check_mem]=/usr/local/nagios/libexec/check_memory.pl -w 6% -c 3%
command[check_disk]=/usr/local/nagios/libexec/check_disk -w 20% -c 8% -p /
command[check_swap]=/usr/local/nagios/libexec/check_swap -w 20% -c 10%
command[check_iostat]=/usr/local/nagios/libexec/check_iostat -w 6 -c 10
command[check_port_80]=/usr/local/nagios/libexec/check_tcp -H localhost -p80
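Every check_nrpe!NAME on the server needs a matching command[NAME] entry in nrpe.cfg on the remote host. The sketch below, under the assumption that both pieces of text are available in one place, extracts the command names from an nrpe.cfg text and reports the ones a list of check_command values refers to but that are not defined. The names nrpe_commands and missing_commands are hypothetical helpers.

```python
import re

# Matches active lines such as: command[check_load]=/path/to/check_load -w ...
COMMAND_RE = re.compile(r"^command\[(?P<name>[^\]]+)\]\s*=\s*(?P<line>.+)$")


def nrpe_commands(nrpe_cfg_text):
    """Map command name -> command line for uncommented command[...] entries."""
    found = {}
    for raw in nrpe_cfg_text.splitlines():
        match = COMMAND_RE.match(raw.strip())
        if match:
            found[match.group("name")] = match.group("line").strip()
    return found


def missing_commands(check_commands, defined):
    """Return NRPE command names used via check_nrpe!NAME but not defined."""
    missing = []
    for check_command in check_commands:
        name, _, arg = check_command.partition("!")
        if name == "check_nrpe" and arg not in defined:
            missing.append(arg)
    return missing


NRPE_SAMPLE = """\
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,6 -c 30,25,20
command[check_disk]=/usr/local/nagios/libexec/check_disk -w 20% -c 8% -p /
#command[check_mem]=/usr/local/nagios/libexec/check_memory.pl -w 6% -c 3%
"""

defined = nrpe_commands(NRPE_SAMPLE)
used = ["check_nrpe!check_load", "check_nrpe!check_mem", "check_tcp!80"]
print(sorted(defined))
print(missing_commands(used, defined))
```

In the sample, check_mem is deliberately commented out to show a mismatch; check_tcp!80 is ignored because it does not go through NRPE.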
/usr/local/nagios/etc/services/web.cfg
[root@chboc etc]# egrep -v "^$|^#" services/web.cfg
define service{
use generic-service
host_name lnmp
service_description blog_url
check_command check_http!-I 192.168.1.198
max_check_attempts 3
normal_check_interval 2
retry_check_interval 1
check_period 24x7
notification_interval 30
notification_period 24x7
notification_options w,u,c,r
contact_groups admins
}
define service{
use generic-service
host_name lnmp
service_description blog_port80
check_command check_tcp!80
max_check_attempts 3
normal_check_interval 2
retry_check_interval 1
check_period 24x7
notification_interval 30
notification_period 24x7
notification_options w,u,c,r
contact_groups admins
}
define service{
use generic-service
host_name lnmp
service_description mysqld_port3306
check_command check_tcp!3306
max_check_attempts 3
normal_check_interval 2
retry_check_interval 1
check_period 24x7
notification_interval 30
notification_period 24x7
notification_options w,u,c,r
contact_groups admins
}
define service{
use generic-service
host_name lnmp
service_description blog_port_80_beidong
check_command check_nrpe!check_port_80
max_check_attempts 3
normal_check_interval 2
retry_check_interval 1
check_period 24x7
notification_interval 30
notification_period 24x7
notification_options w,u,c,r
contact_groups admins
}
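Active mode does not need the check_nrpe plugin; it uses the commands defined in commands.cfg directly (check_http and check_tcp above).
Checking the configuration files
Modify the /etc/init.d/nagios startup script so that checkconfig prints the detailed output. The relevant block is shown below; the > /dev/null 2>&1 redirect on the $NagiosBin -v line is what hides the details, so removing it is presumably the change needed.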
vim /etc/init.d/nagios +178
checkconfig)
printf "Running configuration check..."
$NagiosBin -v $NagiosCfgFile > /dev/null 2>&1
if [ $? -eq 0 ]; then
echo " OK."
else
echo " CONFIG ERROR! Check your Nagios configuration."
exit 1
fi
For example:
Comment out the check_nrpe command defined in commands.cfg:
[root@chboc objects]# tail -4 commands.cfg
#define command{
# command_name check_nrpe
# command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
# }
Run the check:
[root@chboc objects]# /etc/init.d/nagios checkconfig
Running configuration check...
Nagios Core 3.5.1
Copyright (c) 2009-2011 Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 08-30-2013
License: GPL
Website: http://www.nagios.org
Reading configuration data...
Read main config file okay...
Processing object config file '/usr/local/nagios/etc/objects/commands.cfg'...
Processing object config file '/usr/local/nagios/etc/objects/contacts.cfg'...
Processing object config file '/usr/local/nagios/etc/objects/timeperiods.cfg'...
Processing object config file '/usr/local/nagios/etc/objects/templates.cfg'...
Processing object config file '/usr/local/nagios/etc/objects/services.cfg'...
Processing object config file '/usr/local/nagios/etc/objects/hosts.cfg'...
Processing object config directory '/usr/local/nagios/etc/services'...
Processing object config file '/usr/local/nagios/etc/services/web.cfg'...
Read object config files okay...
Running pre-flight check on configuration data...
Checking services...
Error: Service check command 'check_nrpe' specified in service 'Disk Partition' for host 'lnmp' not defined anywhere!
Error: Service check command 'check_nrpe' specified in service 'blog_port_80_beidong' for host 'lnmp' not defined anywhere!
Error: Service check command 'check_nrpe' specified in service 'iostat' for host 'lnmp' not defined anywhere!
Error: Service check command 'check_nrpe' specified in service 'load' for host 'lnmp' not defined anywhere!
Error: Service check command 'check_nrpe' specified in service 'mem' for host 'lnmp' not defined anywhere!
Error: Service check command 'check_nrpe' specified in service 'swap' for host 'lnmp' not defined anywhere!
Checked 9 services.
Checking hosts...
Warning: Host 'lamp' has no services associated with it!
Checked 2 hosts.
Checking host groups...
Checked 1 host groups.
Checking service groups...
Checked 0 service groups.
Checking contacts...
Checked 1 contacts.
Checking contact groups...
Checked 1 contact groups.
Checking service escalations...
Checked 0 service escalations.
Checking service dependencies...
Checked 0 service dependencies.
Checking host escalations...
Checked 0 host escalations.
Checking host dependencies...
Checked 0 host dependencies.
Checking commands...
Checked 24 commands.
Checking time periods...
Checked 5 time periods.
Checking for circular paths between hosts...
Checking for circular host and service dependencies...
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...
Total Warnings: 1
Total Errors: 6
***> One or more problems was encountered while running the pre-flight check...
Check your configuration file(s) to ensure that they contain valid
directives and data defintions. If you are upgrading from a previous
version of Nagios, you should be aware that some variables/definitions
may have been removed or modified in this version. Make sure to read
the HTML documentation regarding the config files, as well as the
'Whats New' section to find out what has changed.
CONFIG ERROR! Check your Nagios configuration.
Restore the check_nrpe command definition in commands.cfg and check again:
[root@chboc objects]# /etc/init.d/nagios checkconfig
Running configuration check...
Nagios Core 3.5.1
Copyright (c) 2009-2011 Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 08-30-2013
License: GPL
Website: http://www.nagios.org
Reading configuration data...
Read main config file okay...
Processing object config file '/usr/local/nagios/etc/objects/commands.cfg'...
Processing object config file '/usr/local/nagios/etc/objects/contacts.cfg'...
Processing object config file '/usr/local/nagios/etc/objects/timeperiods.cfg'...
Processing object config file '/usr/local/nagios/etc/objects/templates.cfg'...
Processing object config file '/usr/local/nagios/etc/objects/services.cfg'...
Processing object config file '/usr/local/nagios/etc/objects/hosts.cfg'...
Processing object config directory '/usr/local/nagios/etc/services'...
Processing object config file '/usr/local/nagios/etc/services/web.cfg'...
Read object config files okay...
Running pre-flight check on configuration data...
Checking services...
Checked 10 services.
Checking hosts...
Checked 2 hosts.
Checking host groups...
Checked 1 host groups.
Checking service groups...
Checked 0 service groups.
Checking contacts...
Checked 1 contacts.
Checking contact groups...
Checked 1 contact groups.
Checking service escalations...
Checked 0 service escalations.
Checking service dependencies...
Checked 0 service dependencies.
Checking host escalations...
Checked 0 host escalations.
Checking host dependencies...
Checked 0 host dependencies.
Checking commands...
Checked 25 commands.
Checking time periods...
Checked 5 time periods.
Checking for circular paths between hosts...
Checking for circular host and service dependencies...
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...
Total Warnings: 0
Total Errors: 0
Things look okay - No serious problems were detected during the pre-flight check
OK.
Note that the files checked are exactly those specified in nagios.cfg.
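When the check is run from a script, the long pre-flight output can be reduced to its totals and error lines. The sketch below parses a captured output text; preflight_totals and error_lines are hypothetical helper names, and PREFLIGHT_EXCERPT is an abbreviated excerpt of the failing run shown earlier (only two of the six error lines are kept).

```python
import re


def preflight_totals(output):
    """Return (warnings, errors) from the 'Total Warnings/Errors' summary lines."""
    warnings = re.search(r"^Total Warnings:\s*(\d+)", output, re.MULTILINE)
    errors = re.search(r"^Total Errors:\s*(\d+)", output, re.MULTILINE)
    if not (warnings and errors):
        raise ValueError("no pre-flight summary found in output")
    return int(warnings.group(1)), int(errors.group(1))


def error_lines(output):
    """Return the lines that Nagios prefixed with 'Error:'."""
    return [line for line in output.splitlines() if line.startswith("Error:")]


PREFLIGHT_EXCERPT = """\
Checking services...
Error: Service check command 'check_nrpe' specified in service 'Disk Partition' for host 'lnmp' not defined anywhere!
Error: Service check command 'check_nrpe' specified in service 'load' for host 'lnmp' not defined anywhere!
Checking hosts...
Warning: Host 'lamp' has no services associated with it!
Total Warnings: 1
Total Errors: 6
"""

totals = preflight_totals(PREFLIGHT_EXCERPT)
print(totals)
for line in error_lines(PREFLIGHT_EXCERPT):
    print(line)
```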
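/usr/local/nagios/libexec/
nagios_server monitors remote_linux mainly through the scripts under /usr/local/nagios/libexec/. So start testing by running these scripts by hand: this is the first step, and if it does not work there will certainly be no monitoring results.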
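A plugin reports its result through its exit code as well as its text: 0 is OK, 1 is WARNING, 2 is CRITICAL, and 3 is UNKNOWN. The sketch below runs a command and maps the exit code to a state name. The function run_plugin is illustrative, and because the real plugins may not be installed where this is tried, the demo uses a small stand-in command instead of an actual check_* script.

```python
import subprocess
import sys

# Standard plugin exit codes.
STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def run_plugin(argv, timeout=10):
    """Run a plugin command and return (state_name, first_output_line)."""
    proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    state = STATES.get(proc.returncode, "UNKNOWN")
    first_line = proc.stdout.splitlines()[0] if proc.stdout.strip() else ""
    return state, first_line


# Stand-in for a real plugin: prints one line and exits with code 1.
fake_plugin = [
    sys.executable,
    "-c",
    "import sys; print('DEMO WARNING - stand-in plugin'); sys.exit(1)",
]

result = run_plugin(fake_plugin)
print(result)
```

With the plugins installed, the argv list could instead be something like ["./check_nrpe", "-H", "192.168.1.198", "-c", "check_load"], run from the libexec directory.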
Passive mode:
[root@chboc libexec]# ./check_nrpe -H 192.168.1.198 -c check_port_80
TCP OK - 0.000 second response time on port 80|time=0.000166s;;;0.000000;10.000000
[root@chboc libexec]# ./check_nrpe -H 192.168.1.198 -c check_load
OK - load average: 0.05, 0.01, 0.00|load1=0.050;15.000;30.000;0; load5=0.010;10.000;25.000;0; load15=0.000;6.000;20.000;0;
[root@chboc libexec]# ./check_nrpe -H 192.168.1.198 -c check_iostat
IOSTAT OK - user 0.04 nice 0.00 sys 0.19 iowait 0.18 idle 0.00 | iowait=0.18%;; idle=0.00%;; user=0.04%;; nice=0.00%;; sys=0.19%;;
[root@chboc libexec]# ./check_nrpe -H 192.168.1.198 -c check_disk
DISK OK - free space: / 4702 MB (57% inode=81%);| /=3437MB;6860;7889;0;8575
[root@chboc libexec]# ./check_nrpe -H 192.168.1.198 -c check_mem
CHECK_MEMORY OK - 847M free | free=888504320b;62198906.88:;31099453.44:
[root@chboc libexec]# ./check_nrpe -H 192.168.1.198 -c check_swap
SWAP OK - 100% free (1023 MB out of 1023 MB) |swap=1023MB;204;102;0;1023
Active mode:
[root@chboc libexec]# ./check_tcp -H 192.168.1.198 -p3306
TCP OK - 0.000 second response time on port 3306|time=0.000478s;;;0.000000;10.000000
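In the outputs above, everything after the "|" is performance data in the form label=value[unit];warn;crit;min;max. The following minimal sketch splits one output line into its status text and metrics. The function parse_plugin_output is a hypothetical helper, and it assumes labels contain no spaces, which holds for the lines shown here but not for every plugin.

```python
import re

# Leading number, then an optional unit such as MB, s, or %.
VALUE_RE = re.compile(r"^(-?\d+(?:\.\d+)?)(.*)$")


def parse_plugin_output(line):
    """Split 'TEXT | perfdata' into (text, {label: metric fields})."""
    text, _, perf = line.partition("|")
    metrics = {}
    for item in perf.split():
        label, sep, rest = item.partition("=")
        if not sep:
            continue  # not a label=value item
        fields = rest.split(";")
        match = VALUE_RE.match(fields[0])
        if not match:
            continue  # value is not numeric
        metrics[label] = {
            "value": float(match.group(1)),
            "uom": match.group(2),
            "warn": fields[1] if len(fields) > 1 else "",
            "crit": fields[2] if len(fields) > 2 else "",
        }
    return text.strip(), metrics


# Output line copied from the check_swap run above.
swap_line = "SWAP OK - 100% free (1023 MB out of 1023 MB) |swap=1023MB;204;102;0;1023"
swap_text, swap_metrics = parse_plugin_output(swap_line)
print(swap_text)
print(swap_metrics)
```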
Note:
Use --help to see how a script is used.
[root@chboc libexec]# ./check_http --help
......
Usage:
check_http -H <vhost> | -I <IP-address> [-u <uri>] [-p <port>]
[-w <warn time>] [-c <critical time>] [-t <timeout>] [-L] [-a auth]
[-b proxy_auth] [-f <ok|warning|critcal|follow|sticky|stickyport>]
[-e <expect>] [-s string] [-l] [-r <regex> | -R <case-insensitive regex>]
[-P string] [-m <min_pg_size>:<max_pg_size>] [-4|-6] [-N] [-M <age>]
[-A string] [-k string] [-S <version>] [--sni] [-C <warn_age>[,<crit_age>]]
[-T <content-type>] [-j method]
NOTE: One or both of -H and -I must be specified
Options:
-h, --help
Print detailed help screen
-V, --version
Print version information
-H, --hostname=ADDRESS
Host name argument for servers using host headers (virtual host)
Append a port to include it in the header (eg: example.com:5000)
-I, --IP-address=ADDRESS
IP address or name (use numeric address if possible to bypass DNS lookup).
-p, --port=INTEGER
Port number (default: 80)
-4, --use-ipv4
Use IPv4 connection
-6, --use-ipv6
Use IPv6 connection
-S, --ssl=VERSION
Connect via SSL. Port defaults to 443. VERSION is optional, and prevents
auto-negotiation (1 = TLSv1, 2 = SSLv2, 3 = SSLv3).
--sni
Enable SSL/TLS hostname extension support (SNI)
-C, --certificate=INTEGER
Minimum number of days a certificate has to be valid. Port defaults to 443
(when this option is used the URL is not checked.)
-e, --expect=STRING
Comma-delimited list of strings, at least one of them is expected in
the first (status) line of the server response (default: HTTP/1.)
If specified skips all other status line logic (ex: 3xx, 4xx, 5xx processing)
-s, --string=STRING
String to expect in the content
-u, --url=PATH
URL to GET or POST (default: /)
-P, --post=STRING
URL encoded http POST data
-j, --method=STRING (for example: HEAD, OPTIONS, TRACE, PUT, DELETE)
Set HTTP method.
-N, --no-body
Don't wait for document body: stop reading after headers.
(Note that this still does an HTTP GET or POST, not a HEAD.)
-M, --max-age=SECONDS
Warn if document is more than SECONDS old. the number can also be of
the form "10m" for minutes, "10h" for hours, or "10d" for days.
-T, --content-type=STRING
specify Content-Type header media type when POSTing
-l, --linespan
Allow regex to span newlines (must precede -r or -R)
-r, --regex, --ereg=STRING
Search page for regex STRING
-R, --eregi=STRING
Search page for case-insensitive regex STRING
--invert-regex
Return CRITICAL if found, OK if not
-a, --authorization=AUTH_PAIR
Username:password on sites with basic authentication
-b, --proxy-authorization=AUTH_PAIR
Username:password on proxy-servers with basic authentication
-A, --useragent=STRING
String to be sent in http header as "User Agent"
-k, --header=STRING
Any other tags to be sent in http header. Use multiple times for additional headers
-L, --link
Wrap output in HTML link (obsoleted by urlize)
-f, --onredirect=<ok|warning|critical|follow|sticky|stickyport>
How to handle redirected pages. sticky is like follow but stick to the
specified IP address. stickyport also ensures port stays the same.
-m, --pagesize=INTEGER<:INTEGER>
Minimum page size required (bytes) : Maximum page size required (bytes)
-w, --warning=DOUBLE
Response time to result in warning status (seconds)
-c, --critical=DOUBLE
Response time to result in critical status (seconds)
-t, --timeout=INTEGER
Seconds before connection times out (default: 10)
-v, --verbose
Show details for command-line debugging (Nagios may truncate output)
......
Examples:
CHECK CONTENT: check_http -w 5 -c 10 --ssl -H www.verisign.com