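Without further ado: this post is purely a set of personal notes, so it may feel disorganized. I am simply recording the problems I ran into, one by one, so that I can look them up later and so that they can help anyone still stuck on similar issues.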
System version and information
cat /etc/redhat-release
CentOS release 6.2 (Final)
uname -a
Linux 2.6.32-220.el6.x86_64 x86_64 x86_64 x86_64 GNU/Linux
ifconfig | sed -n 1,2p
eth0      Link encap:Ethernet  HWaddr 40:F2:E9:29:5F:EA
          inet addr:192.168.0.2  Bcast:192.168.69.255  Mask:255.255.255.0
iptables and SELinux are disabled.
Software versions
The LAMP/LNMP stack itself is not covered here; either environment works. Mine is an LNMP environment installed with yum.
nagios-4.0.5.tar.gz
nagios-plugins-1.4.16.tar.gz
nrpe-2.15.tar.gz
pnp4nagios-0.6.19.tar.gz
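Preparing to install Nagios
Make sure yum works; configuring a network yum repository is recommended. Install the libraries the system needs:
yum groupinstall "Compatibility libraries" "Base" "Development tools"
Install LAMP and the required packages:
yum -y install http* php* mysql* perl* net-snmp* openssl* glibc rrdtool rrdtool-devel rrdtool-perl rrdtool-php
chkconfig mysqld on
chkconfig httpd on
chkconfig snmpd on
service httpd start
service mysqld start
service snmpd start
Once everything tests OK, continue to the next step. Check each service separately, then test access to the web page:
ps -ef | grep -v grep | grep http
ps -ef | grep -v grep | grep mysql
ps -ef | grep -v grep | grep snmp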
Installing Nagios
1. Create the nagios user and group
[root@nagios ~]# useradd -s /sbin/nologin nagios
[root@nagios ~]# mkdir /usr/local/nagios
[root@nagios ~]# chown -R nagios:nagios /usr/local/nagios/
2. Compile and install Nagios
[root@nagios tools]# tar zxf nagios-4.0.5.tar.gz
[root@nagios tools]# cd nagios-4.0.5
[root@nagios nagios-4.0.5]# ./configure --prefix=/usr/local/nagios
[root@nagios nagios-4.0.5]# make all && make install && make install-init && make install-commandmode && make install-config && make install-webconf
[root@nagios nagios-4.0.5]# echo $?
0
3. Enable start at boot
chkconfig --add nagios
chkconfig nagios on
chkconfig --list nagios
Installing nagios-plugins
[root@nagios tools]# tar zxf nagios-plugins-1.4.16.tar.gz
[root@nagios tools]# cd nagios-plugins-1.4.16
[root@nagios nagios-plugins-1.4.16]# ./configure --prefix=/usr/local/nagios/
[root@nagios nagios-plugins-1.4.16]# make
[root@nagios nagios-plugins-1.4.16]# make install
[root@nagios nagios-plugins-1.4.16]# echo $?
0
Editing the httpd.conf configuration file
cd /etc/httpd/conf
cp -a httpd.conf httpd.conf.bak
vim httpd.conf
# Append at the end of the file
####### setting for nagios #######
ScriptAlias /nagios/cgi-bin "/usr/local/nagios/sbin"
<Directory "/usr/local/nagios/sbin">
    AuthType Basic
    Options ExecCGI
    AllowOverride None
    Order allow,deny
    Allow from all
    AuthName "nagios access"
    AuthUserFile /usr/local/nagios/etc/htpasswd
    Require valid-user
</Directory>
Alias /nagios "/usr/local/nagios/share"
<Directory "/usr/local/nagios/share">
    AuthType Basic
    Options ExecCGI
    AllowOverride None
    Order allow,deny
    Allow from all
    AuthName "nagios access"
    AuthUserFile /usr/local/nagios/etc/htpasswd
    Require valid-user
</Directory>
Change
DirectoryIndex index.html index.html.var
to
DirectoryIndex index.php index.html index.html.var
Change
Options Indexes FollowSymLinks
to
Options FollowSymLinks
(this stops the site from listing directories)
service httpd restart
Add the Nagios login authentication file. You must use the default nagiosadmin user; otherwise other files have to be modified as well, which is what the two sed commands below do for the user admin. Back up cgi.cfg before modifying it; the backup is skipped here.
[root@nagios etc]# cd /usr/local/nagios/etc
[root@nagios etc]# sed -i 's@nagiosadmin@nagiosadmin,admin@g' cgi.cfg
[root@nagios etc]# sed -i 's@#default_user_name=guest@default_user_name=admin@g' cgi.cfg
[root@nagios nagios]# htpasswd -c /usr/local/nagios/etc/htpasswd admin
New password: ******
Re-type new password: ******
Installing the NRPE plugin
[root@nagios tools]# tar zxf nrpe-2.15.tar.gz
[root@nagios tools]# cd nrpe-2.15
[root@nagios nrpe-2.15]# ./configure; make all; make install-plugin; make install-daemon; make install-daemon-config
Start NRPE:
[root@nagios nrpe-2.15]# /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
[root@nagios nrpe-2.15]# netstat -antl | grep 5666
tcp        0      0 0.0.0.0:5666        0.0.0.0:*        LISTEN
[root@nagios libexec]# /usr/local/nagios/libexec/check_nrpe -H 127.0.0.1
NRPE v2.15
Stop NRPE:
[root@nagios libexec]# ps -ef | grep -v grep | grep nrpe
[root@nagios libexec]# kill -9 <PID>
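The `netstat | grep 5666` check above can be wrapped so that a script can act on the result. This is a minimal sketch, assuming the same `netstat -antl` column layout as shown above; the function name `port_listening` is hypothetical:

```shell
#!/bin/bash
# port_listening PORT
# Reads `netstat -antl` output on stdin and succeeds (exit 0) when some
# socket in LISTEN state is bound to PORT. Assumes the local address is
# the 4th column and the state is the last column, as in the output above.
port_listening() {
    awk -v port="$1" '
        $NF == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
        END { exit !found }
    '
}

# Usage (on a host with netstat available):
#   netstat -antl | port_listening 5666 && echo "nrpe is listening"
```

Anchoring the match on `:PORT` at the end of the local address keeps port 15666 from being mistaken for 5666, which a plain `grep 5666` would not do.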
Checking the Nagios configuration
[root@nagios etc]# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
Total Warnings: 0
Total Errors: 0
Zero warnings and zero errors means the configuration is OK.
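Starting Nagios
[root@nagios etc]# service nagios start
The init script also accepts stop and restart (service nagios stop, service nagios restart).
Then open http://IP/nagios in a browser.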
Installing the pnp4nagios plugin
[root@nagios tools]# tar zxf pnp4nagios-0.6.19.tar.gz
[root@nagios tools]# cd pnp4nagios-0.6.19
[root@nagios pnp4nagios-0.6.19]# ./configure
make all
make install
make install-config
make install-init
make install-webconf
Create the default configuration files:
cd /usr/local/pnp4nagios/etc
cp misccommands.cfg-sample misccommands.cfg
cp nagios.cfg-sample nagios.cfg
cp rra.cfg-sample rra.cfg
cd pages
cp web_traffic.cfg-sample web_traffic.cfg
cd ../check_commands/
cp check_all_local_disks.cfg-sample check_all_local_disks.cfg
cp check_nrpe.cfg-sample check_nrpe.cfg
cp check_nwstat.cfg-sample check_nwstat.cfg
cp /usr/local/pnp4nagios/libexec/* /usr/local/nagios/libexec/
vim /usr/local/nagios/etc/nagios.cfg
Check the following settings:
enable_environment_macros=1
process_performance_data=1
host_perfdata_command=process-host-perfdata
service_perfdata_command=process-service-perfdata
Note: with Nagios 4.x, the configuration above prevents the traffic graphs from being generated later, and the following error is reported:
PNP4Nagios Version 0.6.19
Please check the documentation for information about the following error.
perfdata directory "/usr/local/pnp4nagios/var/perfdata/localhost" for host "localhost" does not exist.
Read FAQ online
file [line]: application/models/data.php [148]
The cause of this error is explained in the referenced material.
The solution is to use Bulk Mode.
vim /usr/local/nagios/etc/nagios.cfg
Check the following settings:
enable_environment_macros=1
process_performance_data=1
Append the following at the end of the file:
# service performance data
service_perfdata_file=/usr/local/pnp4nagios/var/service-perfdata
service_perfdata_file_template=DATATYPE::SERVICEPERFDATA\tTIMET::$TIMET$\tHOSTNAME::$HOSTNAME$\tSERVICEDESC::$SERVICEDESC$\tSERVICEPERFDATA::$SERVICEPERFDATA$\tSERVICECHECKCOMMAND::$SERVICECHECKCOMMAND$\tHOSTSTATE::$HOSTSTATE$\tHOSTSTATETYPE::$HOSTSTATETYPE$\tSERVICESTATE::$SERVICESTATE$\tSERVICESTATETYPE::$SERVICESTATETYPE$
service_perfdata_file_mode=a
service_perfdata_file_processing_interval=15
service_perfdata_file_processing_command=process-service-perfdata-file
# host performance data starting with Nagios 3.0
host_perfdata_file=/usr/local/pnp4nagios/var/host-perfdata
host_perfdata_file_template=DATATYPE::HOSTPERFDATA\tTIMET::$TIMET$\tHOSTNAME::$HOSTNAME$\tHOSTPERFDATA::$HOSTPERFDATA$\tHOSTCHECKCOMMAND::$HOSTCHECKCOMMAND$\tHOSTSTATE::$HOSTSTATE$\tHOSTSTATETYPE::$HOSTSTATETYPE$
host_perfdata_file_mode=a
host_perfdata_file_processing_interval=15
host_perfdata_file_processing_command=process-host-perfdata-file
Save the file.
vim /usr/local/nagios/etc/objects/commands.cfg
# This block can go near the top of the file
define command{
    command_name check_nrpe
    command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}
Next, append the two perfdata processing commands below at the end of the file. Remember that this configuration file already contains default perfdata command definitions; find them and comment them out before adding the ones below, otherwise the Nagios configuration check will report an error.
define command{
    command_name process-service-perfdata-file
    command_line /usr/local/pnp4nagios/libexec/process_perfdata.pl --bulk=/usr/local/pnp4nagios/var/service-perfdata
}
define command{
    command_name process-host-perfdata-file
    command_line /usr/local/pnp4nagios/libexec/process_perfdata.pl --bulk=/usr/local/pnp4nagios/var/host-perfdata
}
Define two PNP templates, one for hosts and one for services, and append them at the end of templates.cfg:
vim /usr/local/nagios/etc/objects/templates.cfg
define host {
    name host-pnp
    action_url /pnp4nagios/index.php/graph?host=$HOSTNAME$&srv=_HOST_
    register 0
}
define service {
    name service-pnp
    action_url /pnp4nagios/index.php/graph?host=$HOSTNAME$&srv=$SERVICEDESC$
    register 0
}
Alternatively, add the action_url line to the existing host and service templates (their other parameters are omitted below). This saves a lot of time later, because hosts no longer need the PNP template added one by one:
vim /usr/local/nagios/etc/objects/templates.cfg
define host {
    action_url /pnp4nagios/index.php/graph?host=$HOSTNAME$&srv=_HOST_
}
define service {
    action_url /pnp4nagios/index.php/graph?host=$HOSTNAME$&srv=$SERVICEDESC$
}
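The two `*_perfdata_file_template` settings make Nagios write one line per check result to the bulk file, with fields separated by tabs and each field written as `KEY::VALUE`. To illustrate that layout, here is a minimal sketch that pulls one field out of such a line. The function name `perfdata_field` and the sample line are made up for illustration and are not part of PNP4Nagios:

```shell
#!/bin/bash
# perfdata_field KEY
# Reads bulk-file lines on stdin (tab-separated KEY::VALUE fields) and
# prints the value of the first field whose key equals KEY.
perfdata_field() {
    awk -F'\t' -v key="$1" '
    {
        for (i = 1; i <= NF; i++) {
            sep = index($i, "::")            # position of the first "::"
            if (sep > 0 && substr($i, 1, sep - 1) == key) {
                print substr($i, sep + 2)    # everything after "::"
                exit
            }
        }
    }'
}

# Hypothetical sample line in the shape of the service template
line=$(printf 'DATATYPE::SERVICEPERFDATA\tTIMET::1413794685\tHOSTNAME::nginx\tSERVICEDESC::check_/\tSERVICEPERFDATA::/=1839MB')
printf '%s\n' "$line" | perfdata_field HOSTNAME
printf '%s\n' "$line" | perfdata_field SERVICEPERFDATA
```

Running the same function against /usr/local/pnp4nagios/var/service-perfdata is a quick way to see whether Nagios is writing the bulk file at all before looking at the PNP side.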
First test the pnp4nagios environment; append the following at the end of httpd.conf
vim /etc/httpd/conf/httpd.conf
Alias /pnp4nagios "/usr/local/pnp4nagios/share"
<Directory "/usr/local/pnp4nagios/share">
    AllowOverride None
    Order allow,deny
    Allow from all
    AuthName "Nagios Access"
    AuthType Basic
    AuthUserFile /usr/local/nagios/etc/htpasswd
    Require valid-user
    <IfModule mod_rewrite.c>
        RewriteEngine On
        Options FollowSymLinks
        RewriteBase /pnp4nagios
        RewriteRule ^(application|modules|system) - [F,L]
        RewriteCond %{REQUEST_FILENAME} !-f
        RewriteCond %{REQUEST_FILENAME} !-d
        RewriteRule .* index.php/$0 [PT,L]
    </IfModule>
</Directory>
service httpd restart
Visit http://IP/pnp4nagios. Once the environment test page passes, rename install.php:
cd /usr/local/pnp4nagios/share/
mv install.php install.php.bak
Editing nagios.cfg
vim /usr/local/nagios/etc/nagios.cfg
cfg_file=/usr/local/nagios/etc/objects/commands.cfg
cfg_file=/usr/local/nagios/etc/objects/contacts.cfg
cfg_file=/usr/local/nagios/etc/objects/timeperiods.cfg
cfg_file=/usr/local/nagios/etc/objects/templates.cfg
cfg_file=/usr/local/nagios/etc/objects/localhost.cfg
cfg_file=/usr/local/nagios/etc/objects/hosts.cfg
cfg_file=/usr/local/nagios/etc/objects/hostgroup.cfg
cfg_file=/usr/local/nagios/etc/objects/services.cfg
or:
cfg_file=/usr/local/nagios/etc/objects/commands.cfg
cfg_file=/usr/local/nagios/etc/objects/contacts.cfg
cfg_file=/usr/local/nagios/etc/objects/timeperiods.cfg
cfg_file=/usr/local/nagios/etc/objects/templates.cfg
cfg_file=/usr/local/nagios/etc/objects/localhost.cfg
cfg_dir=/usr/local/nagios/etc/objects/app
Note: this only enables monitoring of Linux hosts; Windows and switch monitoring are not enabled. To enable them, uncomment the corresponding lines. Either variant works. The difference is that the first keeps all hosts in shared configuration files, while the second gives each host its own configuration file. Both are demonstrated below as method one and method two.
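Either variant only works if every `cfg_file` and `cfg_dir` path actually exists, and some of these files have to be created by hand. This is a minimal sketch that lists the missing paths before Nagios is restarted. The function name `missing_cfg_paths` is hypothetical, and it assumes each directive starts at the beginning of its line with no spaces around `=`:

```shell
#!/bin/bash
# missing_cfg_paths NAGIOS_CFG
# Prints every cfg_file path that is not a regular file and every cfg_dir
# path that is not a directory. Commented-out lines are ignored because
# their key is "#cfg_file" or "#cfg_dir" rather than the bare directive.
missing_cfg_paths() {
    local kind path
    while IFS='=' read -r kind path || [ -n "$kind" ]; do
        case $kind in
            cfg_file) [ -f "$path" ] || echo "$path" ;;
            cfg_dir)  [ -d "$path" ] || echo "$path" ;;
        esac
    done < "$1"
}

# Usage:
#   missing_cfg_paths /usr/local/nagios/etc/nagios.cfg
```

An empty result means all referenced object files and directories are in place; the pre-flight check with `nagios -v` is still needed to validate their contents.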
Adding host configuration, method one
By default, nagios/etc/objects/ does not contain services.cfg, hosts.cfg, or hostgroup.cfg; create them by hand.
vim hosts.cfg
# The use line refers to templates defined in templates.cfg. If the action_url
# was added directly to the existing define host and define service templates,
# host-pnp can be left out here, because linux-server already includes it.
define host{
    use             linux-server,host-pnp
    host_name       cacti            ; must be the hostname of the monitored host
    alias           cacti-web        ; alias, any name you like
    address         192.168.0.3      ; IP address of the host
    contact_groups  admins           ; contact group for email, shown further below
}
define host{
    use             linux-server,host-pnp
    host_name       nginx
    alias           nginx-web
    address         192.168.0.4
    contact_groups  admins
}
Add one such block for every machine.
vim hostgroup.cfg
define hostgroup{
    hostgroup_name  servers          ; group name
    alias           servers_group    ; alias
    members         cacti,nginx      ; host names, separated by commas
}
vim services.cfg
# All hosts share this one file, which gets messy
#### set cacti host
define service{
    use                     local-service,service-pnp
    host_name               cacti
    service_description     http
    check_command           check_http
    contact_groups          admins
    flap_detection_enabled  0
}
define service{
    use                     local-service,service-pnp
    host_name               cacti
    service_description     SSH_port
    check_command           check_tcp!22
    contact_groups          admins
    flap_detection_enabled  0
}
define service{
    use                     local-service,service-pnp
    host_name               cacti
    service_description     check_/
    check_command           check_nrpe!check_/    ; checked through NRPE, must be defined on the client
    contact_groups          admins
    flap_detection_enabled  0
}
#### set nginx host
define service{
    use                     local-service,service-pnp
    host_name               nginx
    service_description     Check_free_mem
    check_command           check_nrpe!check_free_mem
    contact_groups          admins
    flap_detection_enabled  0
}
define service{
    use                     local-service,service-pnp
    host_name               nginx
    service_description     check_/
    check_command           check_nrpe!check_/    ; checked through NRPE, must be defined on the client
    contact_groups          admins
    flap_detection_enabled  0
}
Add one block for every service you need. End of method one.
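Adding host configuration, method two
cd nagios/etc/objects/
mkdir app
cd app
vim 192.168.0.4.cfg
# Each host gets its own file that defines all of its monitored objects.
# No host group is defined here, since it would add little.
### Define the host
# The use line refers to templates defined in templates.cfg. If the action_url
# was added directly to the existing define host and define service templates,
# host-pnp can be left out here, because linux-server already includes it.
define host{
    use             linux-server,host-pnp
    host_name       nginx            ; must be the hostname of the monitored host
    alias           nginx-web        ; alias, any name you like
    address         192.168.0.4      ; IP address of the host
    contact_groups  admins           ; contact group for email, shown further below
}
### Define the services
define service{
    use                     local-service,service-pnp
    host_name               nginx
    service_description     Check_free_mem
    check_command           check_nrpe!check_free_mem
    contact_groups          admins
    flap_detection_enabled  0
}
define service{
    use                     local-service,service-pnp
    host_name               nginx
    service_description     check_/
    check_command           check_nrpe!check_/    ; checked through NRPE, must be defined on the client
    contact_groups          admins
    flap_detection_enabled  0
}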
vim 192.168.0.3.cfg
### Define the host
# The use line refers to templates defined in templates.cfg. If the action_url
# was added directly to the existing define host and define service templates,
# host-pnp can be left out here, because linux-server already includes it.
define host{
    use             linux-server,host-pnp
    host_name       cacti            ; must be the hostname of the monitored host
    alias           cacti-web        ; alias, any name you like
    address         192.168.0.3      ; IP address of the host
    contact_groups  admins           ; contact group for email, shown further below
}
### Define the service
define service{
    use                     local-service,service-pnp
    host_name               cacti
    service_description     Check_free_mem
    check_command           check_nrpe!check_free_mem
    contact_groups          admins
    flap_detection_enabled  0
}
This approach is much more convenient than the first. End of the two methods for adding hosts.
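Since every per-host file in method two has the same shape, the host block can be generated rather than typed. This is a minimal sketch, assuming the `linux-server` and `host-pnp` templates and the `admins` contact group used in these notes; the function name `host_cfg` is hypothetical:

```shell
#!/bin/bash
# host_cfg NAME ALIAS ADDRESS
# Prints a Nagios host definition in the same shape as the per-host files
# used in method two. Service definitions still have to be added by hand.
host_cfg() {
    cat <<EOF
define host{
    use             linux-server,host-pnp
    host_name       $1
    alias           $2
    address         $3
    contact_groups  admins
}
EOF
}

# Usage (file name is only a convention, Nagios reads every *.cfg in cfg_dir):
#   host_cfg nginx nginx-web 192.168.0.4 > /usr/local/nagios/etc/objects/app/192.168.0.4.cfg
```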
Nagios email alert settings
[root@nagios objects]# vim contacts.cfg
# For details on each parameter, look them up online
define contact{
    contact_name                    nagiosadmin
    use                             generic-contact
    alias                           Nagios Admin
    service_notification_period     24x7
    host_notification_period        24x7
    service_notification_options    w,u,c,r
    host_notification_options       d,u,r
    service_notification_commands   notify-service-by-email
    host_notification_commands      notify-host-by-email
    email                           xxxx@163.com
}
define contactgroup{
    contactgroup_name   admins                  ; the admins group referenced in the host and service definitions
    alias               Nagios Administrators
    members             nagiosadmin
}
Checking the configuration files for errors
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
Total Warnings: 0
Total Errors: 0
Things look okay - No serious problems were detected during the pre-flight check
service nagios restart
End of the server-side configuration.
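To avoid restarting Nagios on a broken configuration, the pre-flight result can gate the restart. This is a minimal sketch that parses the two `Total ...` summary lines of the `nagios -v` output; the function name `preflight_ok` is hypothetical:

```shell
#!/bin/bash
# preflight_ok
# Reads `nagios -v` output on stdin and succeeds (exit 0) only when both
# the "Total Warnings:" and "Total Errors:" summary lines are present
# and both report 0.
preflight_ok() {
    awk -F': *' '
        /^Total Warnings:/ { warnings = $2 + 0; seen_w = 1 }
        /^Total Errors:/   { errors   = $2 + 0; seen_e = 1 }
        END { exit !(seen_w && seen_e && warnings == 0 && errors == 0) }
    '
}

# Usage:
#   /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg | preflight_ok \
#       && service nagios restart
```

Treating warnings as failures follows the "0 warnings, 0 errors" rule used in these notes; relax the condition if warnings are acceptable in your setup.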
Client installation and configuration
net-snmp must be installed; if other errors come up, resolve them according to the messages.
yum -y install net-snmp*
1. Create the nagios user and group
[root@nagios ~]# useradd -s /sbin/nologin nagios
[root@nagios ~]# mkdir /usr/local/nagios
[root@nagios ~]# chown -R nagios:nagios /usr/local/nagios/
2. Install nagios-plugins
[root@nagios tools]# tar zxf nagios-plugins-1.4.16.tar.gz
[root@nagios tools]# cd nagios-plugins-1.4.16
[root@nagios nagios-plugins-1.4.16]# ./configure --prefix=/usr/local/nagios/
[root@nagios nagios-plugins-1.4.16]# make
[root@nagios nagios-plugins-1.4.16]# make install
[root@nagios nagios-plugins-1.4.16]# echo $?
0
3. Install the NRPE plugin
[root@nagios tools]# tar zxf nrpe-2.15.tar.gz
[root@nagios tools]# cd nrpe-2.15
[root@nagios nrpe-2.15]# ./configure; make all; make install-plugin; make install-daemon; make install-daemon-config
Edit nrpe.cfg so that the Nagios server (192.168.0.2) is allowed to connect:
sed -i 's/allowed_hosts=127.0.0.1/allowed_hosts=127.0.0.1,192.168.0.2/g' /usr/local/nagios/etc/nrpe.cfg
vim /usr/local/nagios/etc/nrpe.cfg
command[check_swap]=/usr/local/nagios/libexec/check_swap -w 20% -c 10%
command[check_data]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /data
command[check_/]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /
command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200
Save the file, then make NRPE start at boot:
echo "/usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d" >> /etc/rc.local
Start NRPE:
[root@nagios nrpe-2.15]# /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
[root@nagios nrpe-2.15]# netstat -antl | grep 5666
tcp        0      0 0.0.0.0:5666        0.0.0.0:*        LISTEN
Run the following on the server to confirm it works; if it fails, check whether the client's firewall and the network allow the traffic:
[root@nagios libexec]# /usr/local/nagios/libexec/check_nrpe -H 192.168.0.3
NRPE v2.15
Stop NRPE:
[root@nagios libexec]# ps -ef | grep -v grep | grep nrpe
[root@nagios libexec]# kill -9 <PID>
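Every `check_nrpe!NAME` used in a service definition on the server has to match a `command[NAME]` entry in the client's nrpe.cfg. This is a minimal sketch that lists the names an nrpe.cfg defines so they can be compared with the server side; the function name `nrpe_command_names` is hypothetical:

```shell
#!/bin/bash
# nrpe_command_names [FILE...]
# Prints the NAME of every active `command[NAME]=...` line, one per line.
# Reads stdin when no file is given. Commented-out lines are skipped
# because the pattern is anchored at the start of the line.
nrpe_command_names() {
    sed -n 's/^command\[\([^]]*\)\]=.*/\1/p' "$@"
}

# Usage:
#   nrpe_command_names /usr/local/nagios/etc/nrpe.cfg
```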
When PNP produces no graphs, check the log
vim /usr/local/pnp4nagios/etc/process_perfdata.cfg
Change
LOG_LEVEL = 0
to
LOG_LEVEL = 2
more /usr/local/pnp4nagios/var/perfdata.log
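The same log level change can be scripted, which makes it easier to switch debugging on and back off. This is a minimal sketch, assuming the setting appears as a single `LOG_LEVEL = N` line at the start of a line; the function name `set_pnp_log_level` is hypothetical:

```shell
#!/bin/bash
# set_pnp_log_level FILE LEVEL
# Rewrites the LOG_LEVEL line of FILE in place. The new content is written
# back with `cat` so the file keeps its owner and permissions.
set_pnp_log_level() {
    local tmp status
    tmp=$(mktemp) || return 1
    sed "s/^LOG_LEVEL[[:space:]]*=.*/LOG_LEVEL = $2/" "$1" > "$tmp" && cat "$tmp" > "$1"
    status=$?
    rm -f "$tmp"
    return $status
}

# Usage:
#   set_pnp_log_level /usr/local/pnp4nagios/etc/process_perfdata.cfg 2   # while debugging
#   set_pnp_log_level /usr/local/pnp4nagios/etc/process_perfdata.cfg 0   # back to the original value
```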
Note: when Nagios monitors processes, no graph is produced even if PNP is configured correctly. For example, these two checks:
OK | 10-20-2014 16:44:45 | 83d 1h 9m 16s | 1/3 | PROCS OK: 503 processes
OK | 10-20-2014 16:46:00 | 83d 1h 7m 58s | 1/3 | PROCS OK: 0 processes with STATE = Z
lead to this PNP error:
PNP4Nagios Version 0.6.19
Please check the documentation for information about the following error.
XML file "/usr/local/pnp4nagios/var/perfdata/app-11/Total_Processes.xml" not found.
Read FAQ online
file [line]: application/models/data.php [312]
For the reason, see this very detailed write-up:
http://storysky.blog.51cto.com/628458/583787/
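PNP can only graph what a plugin reports as performance data, which is the part of the plugin output after a `|`. The status table does not show that part, so one way to narrow the problem down is to run the plugin by hand and test its output. This is a minimal sketch with a hypothetical function name `has_perfdata`; it only checks for a `|` followed by at least one `label=value` pair and does not validate the full perfdata syntax:

```shell
#!/bin/bash
# has_perfdata OUTPUT
# Succeeds (exit 0) when the plugin output line contains a "|" that is
# followed somewhere by an "=" (i.e. at least one label=value pair).
has_perfdata() {
    case $1 in
        *'|'*=*) return 0 ;;
        *)       return 1 ;;
    esac
}

# Usage:
#   out=$(/usr/local/nagios/libexec/check_procs -w 150 -c 200)
#   has_perfdata "$out" || echo "no performance data: PNP has nothing to graph"
```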
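If the stock Nagios plugins do not meet your needs, you can develop your own.
For example, below is a memory monitoring plugin. I found it through a Baidu search and it works well, so I am borrowing it here.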
vim /usr/local/nagios/libexec/check_mem
#!/bin/bash
# Nagios plugin: used memory as a percentage of total memory.
# Relies on the `free -m` layout of CentOS 6, where line 3 is the
# "-/+ buffers/cache" row.
STAT_OK=0
STAT_WARNING=1
STAT_CRITICAL=2
STAT_UNKNOWN=3
total_mem=`free -m | awk 'NR==2{print $2}'`
used_mem=`free -m | awk 'NR==3{print $3}'`    # memory actually in use
free_mem=`free -m | awk 'NR==3{print $4}'`    # free memory plus buffers/cache
use_per=$(( used_mem * 100 / total_mem ))     # used memory as an integer percentage
help() {
    echo "USAGE: `basename $0` -w <used percent> -c <used percent> [-h]"
    exit $STAT_UNKNOWN
}
while getopts ":w:c:h" opt
do
    case $opt in
        w) warning=$OPTARG ;;
        c) critical=$OPTARG ;;
        h) help ;;
        *) echo "error, please check the help. USAGE: ./`basename $0` -h"
           exit $STAT_UNKNOWN ;;
    esac
done
# Both thresholds are required
[ -n "$warning" ] && [ -n "$critical" ] || help
if [[ $use_per -lt $warning ]]; then
    echo "OK - total:$total_mem MB,used:$used_mem MB,free:$free_mem MB | total_mem=$total_mem used_mem=$used_mem free_mem=$free_mem"
    exit $STAT_OK
elif [[ $use_per -lt $critical ]]; then
    echo "WARNING - total:$total_mem MB,used:$used_mem MB,free:$free_mem MB | total_mem=$total_mem used_mem=$used_mem free_mem=$free_mem"
    exit $STAT_WARNING
else
    echo "CRITICAL - total:$total_mem MB,used:$used_mem MB,free:$free_mem MB | total_mem=$total_mem used_mem=$used_mem free_mem=$free_mem"
    exit $STAT_CRITICAL
fi
Save the file.
chown nagios:nagios check_mem
chmod +x check_mem
./check_mem -w 80 -c 90
OK - total:15926 MB,used:1839 MB,free:14086 MB | total_mem=15926 used_mem=1839 free_mem=14086
vim /usr/local/nagios/etc/nrpe.cfg
Add:
command[check_free_mem]=/usr/local/nagios/libexec/check_mem -w 80 -c 90
Restart NRPE, then edit the host's file under /usr/local/nagios/etc/objects/app/ and add:
define service{
    use                     local-service,service-pnp
    host_name               cacti
    service_description     Check_free_mem
    check_command           check_nrpe!check_free_mem
    contact_groups          admins
    flap_detection_enabled  0
}
Check the Nagios configuration, then restart Nagios.
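Nagios reads a plugin's result from its exit code: 0 for OK, 1 for WARNING, 2 for CRITICAL, and 3 for UNKNOWN, which is what the STAT_* constants in the script encode. The threshold logic on its own can be sketched as follows; the function name `threshold_state` is hypothetical, and it assumes integer values with the warning threshold below the critical one:

```shell
#!/bin/bash
# threshold_state VALUE WARN CRIT
# Prints the state name and returns the matching Nagios exit code:
#   VALUE <  WARN          -> OK       (0)
#   WARN <= VALUE < CRIT   -> WARNING  (1)
#   VALUE >= CRIT          -> CRITICAL (2)
threshold_state() {
    if [ "$1" -lt "$2" ]; then
        echo OK
        return 0
    elif [ "$1" -lt "$3" ]; then
        echo WARNING
        return 1
    else
        echo CRITICAL
        return 2
    fi
}

# Usage, with the 80/90 thresholds from the nrpe.cfg entry:
#   threshold_state 11 80 90    # 11% used
```

Keeping the comparison in one small function makes it easy to reuse the same convention in other home-grown checks.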
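Monitoring Windows hosts and switches is not hard to configure; with a clear plan you will get it working. Nagios configuration is not really difficult, just somewhat tedious. Once you understand how the configuration files relate to one another, everything becomes simple.
That is all.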