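Introduction to Nagios
Nagios is an open-source monitoring tool for computer systems and networks. It monitors the status of Windows, Linux, and UNIX hosts and of the services running on them. When a status becomes abnormal, it sends an email or SMS alert so that the site's operations staff are notified immediately. Once the status recovers, it sends a recovery notification by email or SMS.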
Main features
Network service monitoring: SMTP, POP3, HTTP, NNTP, ICMP, SNMP, FTP, SSH
Host resource monitoring: CPU load, disk usage, system logs (Windows hosts are monitored through the NSClient++ plugin)
Setting up and configuring the Nagios monitoring system
Environment
Host | Operating system | IP address |
Nagios | CentOS 7.3 | 192.168.100.11 |
MySQL | CentOS 7.3 | 192.168.100.12 |
Preparation
Synchronize the time on both hosts (a one-off sync)
    yum install -y ntp
Configure mail on the Nagios host
    yum install -y mailx
mail.rc (QQ Mail as an example)
    set from=xxx@qq.com smtp="smtp.qq.com"
Building the Nagios monitoring environment
Installing Nagios
Disable SELinux
    sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
    setenforce 0
Install the required packages
    yum install -y gcc glibc glibc-common wget unzip httpd php gd gd-devel perl postfix
Download the Nagios source package
    cd /usr/local/src/
    wget -O nagioscore.tar.gz https://github.com/NagiosEnterprises/nagioscore/archive/nagios-4.4.5.tar.gz
    tar xzf nagioscore.tar.gz
Compile Nagios
    cd nagioscore-nagios-4.4.5/
    ./configure
    make all
Create the user and group
    make install-groups-users
    usermod -a -G nagios apache
Install the Nagios binaries, then install the service (daemon) and the configuration files.
    make install
    make install-daemoninit
    systemctl enable httpd.service
    make install-commandmode
    make install-config
    make install-webconf
Open port 80 in the firewall
    firewall-cmd --zone=public --add-port=80/tcp
    firewall-cmd --zone=public --add-port=80/tcp --permanent
Create the Nagios user authentication file
    htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin
Start the services
    systemctl start httpd.service
    systemctl start nagios.service
    # URL: http://192.168.100.11/nagios/
Installing the Nagios plugins
192.168.100.11 (Nagios host)
Install the required packages
    yum install -y gcc glibc glibc-common make gettext automake autoconf wget openssl-devel net-snmp net-snmp-utils epel-release
    yum install -y perl-Net-SNMP
Download the Nagios plugins
    cd /usr/local/src/
    wget --no-check-certificate -O nagios-plugins.tar.gz https://github.com/nagios-plugins/nagios-plugins/archive/release-2.2.1.tar.gz
    tar zxf nagios-plugins.tar.gz
Compile and install
    cd nagios-plugins-release-2.2.1/
    ./tools/setup
    ./configure
    make && make install
Download the NRPE package, upload it to the host, and extract it.
    cd /usr/local/src/
    tar zxf nrpe-3.2.1.tar.gz
Compile and install
    cd nrpe-3.2.1
    ./configure
    make all && make install-plugin && make install-config && make install-daemon
192.168.100.12 (MySQL host)
Install the required packages
    yum install -y openssl-devel gcc
Download the Nagios plugins
    cd /usr/local/src/
    wget --no-check-certificate -O nagios-plugins.tar.gz https://github.com/nagios-plugins/nagios-plugins/archive/release-2.2.1.tar.gz
    tar zxf nagios-plugins.tar.gz
Compile and install
    cd nagios-plugins-release-2.2.1
    libtoolize --force && aclocal && autoheader && automake --force-missing --add-missing && autoconf
    ./tools/setup
    ./configure
    make && make install
    chown -R nagios:nagios /usr/local/nagios/
Download the NRPE package, upload it to the host, and extract it.
    cd /usr/local/src/
    tar zxf nrpe-3.2.1.tar.gz
Compile and install
    cd nrpe-3.2.1
    ./configure
    make all && make install-plugin && make install-config && make install-daemon
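Configuring the Nagios monitoring system
Configuration order:
- Create the conf directory to hold the hosts.cfg definition file
- Use the bundled commands.cfg to define commands
- Create hosts.cfg to define hosts and host groups
- Use the bundled contacts.cfg to define contacts and contact groups
- Use the bundled timeperiods.cfg to define the monitoring time periods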
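The first step in the list, creating the conf directory and referencing it from nagios.cfg, could be scripted roughly as follows. This is only a sketch: the `NAGIOS_ETC` variable and the `add_cfg_dir` helper are names invented here for illustration, and the snippet defaults to a scratch directory rather than the real /usr/local/nagios/etc.

```shell
# Sketch: register a cfg_dir in nagios.cfg only if it is not already listed.
# NAGIOS_ETC (hypothetical variable) defaults to a scratch directory so the
# snippet can be tried safely; on the monitoring host it would be
# /usr/local/nagios/etc.
NAGIOS_ETC="${NAGIOS_ETC:-$(mktemp -d)}"
CFG="$NAGIOS_ETC/nagios.cfg"
CONF_DIR="$NAGIOS_ETC/conf"

touch "$CFG"           # stands in for nagios.cfg when using the scratch directory
mkdir -p "$CONF_DIR"   # directory that will hold hosts.cfg

add_cfg_dir() {
    # grep -qxF: quiet, whole-line, fixed-string match
    grep -qxF "cfg_dir=$CONF_DIR" "$CFG" || echo "cfg_dir=$CONF_DIR" >> "$CFG"
}

add_cfg_dir
add_cfg_dir            # the second call changes nothing
grep -n '^cfg_dir=' "$CFG"
```

Appending only when the line is missing keeps the edit repeatable, which matters because a duplicated cfg_dir entry would make Nagios read the same object files twice.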
192.168.100.11 (Nagios host)
Add the reference and create the directory.
    cd /usr/local/nagios/etc/
    vi nagios.cfg
    # line 55
    cfg_dir=/usr/local/nagios/etc/conf
    mkdir /usr/local/nagios/etc/conf
Append the monitoring command to the end of the bundled commands.cfg
    vi objects/commands.cfg
commands.cfg
    define command{
        command_name check_nrpe
        command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
    }
Edit hosts.cfg to define the host and the host group
    vi conf/hosts.cfg
hosts.cfg
    define host{
        use linux-server
        host_name 192.168.100.12
        alias 192.168.100.12
        address 192.168.100.12
        hostgroups slaves
    }
    define hostgroup{
        hostgroup_name slaves
        alias Linux Servers
    }
    define service{
        use generic-service
        host_name 192.168.100.12
        service_description check-load
        check_command check_nrpe!check_load
        contact_groups system
        notifications_enabled 1
    }
    define service{
        use generic-service
        host_name 192.168.100.12
        service_description check-users
        check_command check_nrpe!check_users
        contact_groups system
        notifications_enabled 1
    }
    define service{
        use generic-service
        host_name 192.168.100.12
        service_description total_procs
        check_command check_nrpe!check_total_procs
        contact_groups system
        notifications_enabled 1
    }
    define service{
        use generic-service
        host_name 192.168.100.12
        service_description check-mariadb
        check_command check_tcp!3306
        contact_groups system
    }
Append the contacts and the contact group to the end of the bundled contacts.cfg
    vi objects/contacts.cfg
contacts.cfg
    define contact {
        contact_name zl
        use generic-contact
        alias zl
        email <email address 1>
    }
    define contact {
        contact_name zl2
        use generic-contact
        alias zl2
        email <email address 2>
    }
    define contactgroup {
        contactgroup_name system
        alias system
        members zl,zl2
    }
Edit the check settings in the bundled templates.cfg so that the monitoring results show up quickly.
    vi objects/templates.cfg
    # line 72
    check_interval 1
    # line 74
    max_check_attempts 2
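Line numbers shift between Nagios versions, so the same change can also be made by template name. The sketch below works on a made-up sample file and assumes the two values belong to the linux-server host template, which the text above does not state. To use it on the real file, run only the awk step against a backup copy.

```shell
# Sketch: set check_interval and max_check_attempts by template name.
# TPL is a scratch sample, not the real templates.cfg.
TPL="$(mktemp)"
cat > "$TPL" <<'EOF'
define host {
    name                    linux-server
    check_interval          5
    retry_interval          1
    max_check_attempts      10
}
define service {
    name                    generic-service
    check_interval          10
    max_check_attempts      3
}
EOF

# Only lines between "name linux-server" and the closing brace are rewritten.
awk '
    $1 == "name" && $2 == "linux-server"   { in_tpl = 1 }
    in_tpl && $1 == "check_interval"       { sub(/[0-9]+/, "1") }
    in_tpl && $1 == "max_check_attempts"   { sub(/[0-9]+/, "2") }
    $1 == "}"                              { in_tpl = 0 }
    { print }
' "$TPL" > "$TPL.new" && mv "$TPL.new" "$TPL"

grep -E 'name|check_interval|max_check_attempts' "$TPL"
```

The generic-service block in the sample is there to show that other templates keep their values.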
Check the configuration syntax
    /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
The summary at the end of the output should read:
    Total Warnings: 0
    Total Errors: 0
    Things look okay - No serious problems were detected during the pre-flight check
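When this check is run from a script, the summary can be used to decide whether to continue. A minimal sketch, assuming the summary format shown above; `preflight_ok` is a helper name invented here, and the sample text stands in for real `nagios -v` output.

```shell
# Sketch: succeed only if the pre-flight summary reports zero errors.
# preflight_ok (hypothetical helper) reads `nagios -v` output on stdin.
preflight_ok() {
    grep -Eq 'Total Errors:[[:space:]]*0([^0-9]|$)'
}

# Sample summary standing in for real output.
sample='Total Warnings: 0
Total Errors:   0'

if printf '%s\n' "$sample" | preflight_ok; then
    echo "configuration looks valid"
else
    echo "configuration has errors"
fi

# On the monitoring host the same helper could gate the next step:
#   /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg | preflight_ok \
#       && echo "safe to continue"
```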
Once the syntax check passes, restart the Nagios service
    systemctl restart nagios.service
192.168.100.12 (MySQL host)
On the monitored host, configure the IP addresses that are allowed to monitor it
    vi /usr/local/nagios/etc/nrpe.cfg
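The setting that controls this in nrpe.cfg is `allowed_hosts`. As a rough sketch of the edit, the snippet below rewrites that line in a scratch copy. The sample content is made up for illustration, and the `NRPE_CFG` and `NAGIOS_SERVER` variable names are assumptions, not part of NRPE.

```shell
# Sketch: allow the Nagios server to query NRPE on the monitored host.
# NRPE_CFG is a scratch file; the real one is /usr/local/nagios/etc/nrpe.cfg.
NRPE_CFG="$(mktemp)"
printf '%s\n' '# sample nrpe.cfg fragment' 'allowed_hosts=127.0.0.1' > "$NRPE_CFG"

NAGIOS_SERVER="192.168.100.11"   # monitoring host from the environment table

# Rewrite the whole allowed_hosts line, keeping localhost in the list.
sed "s/^allowed_hosts=.*/allowed_hosts=127.0.0.1,$NAGIOS_SERVER/" "$NRPE_CFG" > "$NRPE_CFG.new" \
    && mv "$NRPE_CFG.new" "$NRPE_CFG"

grep '^allowed_hosts=' "$NRPE_CFG"
```

NRPE reads this file at startup, so the daemon has to be restarted before the new address is accepted.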
Edit nrpe.cfg to enable the monitoring plugins. The following command lists the active (non-comment, non-empty) lines:
    egrep -v "^#|^$" /usr/local/nagios/etc/nrpe.cfg
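To show what that filter leaves behind, here is a sketch run against a made-up sample file. The thresholds are illustrative assumptions. What matters is that each `command[...]` name matches a name passed to check_nrpe in hosts.cfg (check_users, check_load, check_total_procs).

```shell
# Sketch: filter a sample nrpe.cfg down to its active lines.
SAMPLE="$(mktemp)"
cat > "$SAMPLE" <<'EOF'
# NRPE sample (illustrative values only)

allowed_hosts=127.0.0.1,192.168.100.11

# command definitions
command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200
EOF

# grep -Ev is the non-deprecated spelling of egrep -v:
# drop comment lines and empty lines.
active="$(grep -Ev '^#|^$' "$SAMPLE")"
printf '%s\n' "$active"
```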
Install MySQL (MariaDB) and start the service.
    yum install -y mariadb-server
How the definition files relate to each other
Definition file for the monitoring command
    define command{
        command_name check_nrpe
        command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
    }
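To see how this definition connects to a service, take a `check_command` of `check_nrpe!check_load`. The text after the `!` becomes `$ARG1$`, and the host's `address` becomes `$HOSTADDRESS$`. The sketch below imitates that substitution with sed. It is not how Nagios implements it, and the value used for `$USER1$` is the usual plugin directory, assumed here rather than read from resource.cfg.

```shell
# Sketch: imitate the macro substitution for a single-argument check_command.
command_line='$USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$'
check_command='check_nrpe!check_load'   # as written in the service definition

USER1='/usr/local/nagios/libexec'       # assumed value of $USER1$
HOSTADDRESS='192.168.100.12'            # "address" of the host definition
ARG1="${check_command#*!}"              # text after the first "!" (check_load)

expanded="$(printf '%s\n' "$command_line" | sed \
    -e 's|\$USER1\$|'"$USER1"'|' \
    -e 's|\$HOSTADDRESS\$|'"$HOSTADDRESS"'|' \
    -e 's|\$ARG1\$|'"$ARG1"'|')"

echo "$expanded"
```

The resulting command line asks the NRPE daemon on 192.168.100.12 to run whatever its own nrpe.cfg defines under the name check_load.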
Definition file for contacts and contact groups
    define contact {
        contact_name zl
        use generic-contact
        alias zl
        email <email address 1>
    }
    define contact {
        contact_name zl2
        use generic-contact
        alias zl2
        email <email address 2>
    }
    define contactgroup {
        contactgroup_name system
        alias system
        members zl,zl2
    }
Definition file for hosts and host groups
    define host{
        use linux-server
        host_name 192.168.100.12
        alias 192.168.100.12
        address 192.168.100.12
        hostgroups slaves
    }
    define hostgroup{
        hostgroup_name slaves
        alias Linux Servers
    }
    define service{
        use generic-service
        host_name 192.168.100.12
        service_description check-load
        check_command check_nrpe!check_load
        contact_groups system
        notifications_enabled 1
    }
    define service{
        use generic-service
        host_name 192.168.100.12
        service_description check-users
        check_command check_nrpe!check_users
        contact_groups system
        notifications_enabled 1
    }
    define service{
        use generic-service
        host_name 192.168.100.12
        service_description total_procs
        check_command check_nrpe!check_total_procs
        contact_groups system
        notifications_enabled 1
    }
    define service{
        use generic-service
        host_name 192.168.100.12
        service_description check-mariadb
        check_command check_tcp!3306
        contact_groups system
    }
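The link between these files is by name: every `contact_groups` value in a service must match a `contactgroup_name` in the contacts file. The Nagios pre-flight check verifies this for real. The sketch below only illustrates the relationship on two small sample files, and assumes one group per `contact_groups` line.

```shell
# Sketch: check that every referenced contact group is defined.
HOSTS="$(mktemp)"
CONTACTS="$(mktemp)"
cat > "$HOSTS" <<'EOF'
define service{
    host_name       192.168.100.12
    check_command   check_nrpe!check_load
    contact_groups  system
}
EOF
cat > "$CONTACTS" <<'EOF'
define contactgroup {
    contactgroup_name system
    members           zl,zl2
}
EOF

referenced="$(awk '$1 == "contact_groups" { print $2 }' "$HOSTS" | sort -u)"
defined="$(awk '$1 == "contactgroup_name" { print $2 }' "$CONTACTS" | sort -u)"

missing=""
for group in $referenced; do
    # -x: whole line, -F: fixed string
    printf '%s\n' "$defined" | grep -qxF "$group" || missing="$missing $group"
done

if [ -z "$missing" ]; then
    echo "all contact groups resolve"
else
    echo "undefined contact groups:$missing"
fi
```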