Nagios Monitoring System


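Introduction to Nagios

Nagios is an open-source tool for monitoring computer systems and networks. It tracks the state of Windows, Linux, and UNIX hosts and of the services running on them. When a state becomes abnormal, it sends an email or SMS alert so that the site's operations staff are notified right away. When the state recovers, it sends a recovery notification by email or SMS.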

Main features

 Network service monitoring: SMTP, POP3, HTTP, NNTP, ICMP, SNMP, FTP, SSH
 Host resource monitoring: CPU load, disk usage, system logs
 --Windows hosts are monitored through the NSClient++ plugin

Setting up and configuring the Nagios monitoring system

Environment

Host      Operating system   IP address
Nagios    CentOS 7.3         192.168.100.11
MySQL     CentOS 7.3         192.168.100.12

Preparation

Synchronize the clocks on both hosts (a one-off sync)

yum install -y ntp
ntpdate ntp1.aliyun.com

Configure mail on the Nagios host

yum install -y mailx
vi /etc/mail.rc

# Send a test email
echo "--------test---------" | mail -s "mailx-test" xxx@qq.com

mail.rc (QQ Mail as an example)

set from=xxx@qq.com smtp="smtp.qq.com"
set smtp-auth-user="xxx@qq.com" smtp-auth-password="your-authorization-code"
set smtp-auth=login

Setting up the Nagios monitoring environment

Installing Nagios

  1. Disable SELinux

    sed -i 's/SELINUX=.*/SELINUX=disabled/g' /etc/selinux/config
    setenforce 0
  2. Install the required packages

    yum install -y gcc glibc glibc-common wget unzip httpd php gd gd-devel perl postfix
  3. Download the Nagios source package

    cd /usr/local/src/
    wget -O nagioscore.tar.gz https://github.com/NagiosEnterprises/nagioscore/archive/nagios-4.4.5.tar.gz
    tar xzf nagioscore.tar.gz
  4. Compile Nagios

    cd nagioscore-nagios-4.4.5/
    ./configure
    make all
  5. Create the nagios user and group, and add the apache user to the nagios group

    make install-groups-users
    usermod -a -G nagios apache
  6. Install the Nagios binaries, the service (daemon) unit, and the configuration files.

    make install
    make install-daemoninit
    systemctl enable httpd.service

    make install-commandmode
    make install-config
    make install-webconf
  7. Open port 80 in the firewall

    firewall-cmd --zone=public --add-port=80/tcp
    firewall-cmd --zone=public --add-port=80/tcp --permanent
  8. Create the authentication file for the Nagios web user

    htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin
    # Enter the password:
    # Enter the password again:
  9. Start the services

    systemctl start httpd.service
    systemctl start nagios.service
    # url: http://192.168.100.11/nagios/

Installing the Nagios plugins

192.168.100.11

  1. Install the required packages

    yum install -y gcc glibc glibc-common make gettext automake autoconf wget openssl-devel net-snmp net-snmp-utils epel-release
    yum install -y perl-Net-SNMP
  2. Download the Nagios plugins

    cd /usr/local/src/
    wget --no-check-certificate -O nagios-plugins.tar.gz https://github.com/nagios-plugins/nagios-plugins/archive/release-2.2.1.tar.gz
    tar zxf nagios-plugins.tar.gz
  3. Compile and install

    # Relative path: the archive was extracted under /usr/local/src/
    cd nagios-plugins-release-2.2.1/
    ./tools/setup
    ./configure
    make && make install
  4. Download the NRPE package, upload it to the host, and extract it.

    cd /usr/local/src/
    tar zxf nrpe-3.2.1.tar.gz
  5. Compile and install

    cd nrpe-3.2.1
    ./configure
    make all && make install-plugin && make install-config && make install-daemon

192.168.100.12

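  1. Install the required packages (the build steps below also call wget, make, libtoolize, automake, and autoconf)

    yum install -y openssl-devel gcc make gettext automake autoconf libtool wget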
  2. Download the Nagios plugins

    cd /usr/local/src/
    wget --no-check-certificate -O nagios-plugins.tar.gz https://github.com/nagios-plugins/nagios-plugins/archive/release-2.2.1.tar.gz
    tar zxf nagios-plugins.tar.gz
  3. Compile and install

    cd nagios-plugins-release-2.2.1
    libtoolize --force && aclocal && autoheader && automake --force-missing --add-missing && autoconf
    ./tools/setup
    ./configure
    make && make install
    chown -R nagios:nagios /usr/local/nagios/
  4. Download the NRPE package, upload it to the host, and extract it.

    cd /usr/local/src/
    tar zxf nrpe-3.2.1.tar.gz
  5. Compile and install

    cd nrpe-3.2.1
    ./configure
    make all && make install-plugin && make install-config && make install-daemon

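Configuring the Nagios monitoring system

Configuration order:

  1. Create the conf directory to hold the hosts.cfg definition file
  2. Use the default commands.cfg file to define commands
  3. Create the hosts.cfg file to define hosts and host groups
  4. Use the default contacts.cfg file to define contacts and contact groups
  5. Use the default timeperiods.cfg file to define the monitoring time periods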

192.168.100.11

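  1. Add the include directive and create the directory it points to.

    cd /usr/local/nagios/etc/
    vi nagios.cfg
    # Line 55: add this directive to nagios.cfg
    cfg_dir=/usr/local/nagios/etc/conf
    # Back in the shell, create the directory
    mkdir /usr/local/nagios/etc/conf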
  2. Append the new check command to the end of the default commands.cfg

    vi objects/commands.cfg

    commands.cfg

    define command{
        command_name            check_nrpe
        command_line            $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
    }
  3. Edit hosts.cfg to define the host, the host group, and the services

    vi conf/hosts.cfg

    hosts.cfg

    define host{

        use                     linux-server
        host_name               192.168.100.12
        alias                   192.168.100.12
        address                 192.168.100.12
        hostgroups              slaves
    }

    define hostgroup{

        hostgroup_name          slaves
        alias                   Linux Servers
    }

    define service{

        use                     generic-service
        host_name               192.168.100.12
        service_description     check-load
        check_command           check_nrpe!check_load
        contact_groups          system
        notifications_enabled   1
    }

    define service{

        use                     generic-service
        host_name               192.168.100.12
        service_description     check-users
        check_command           check_nrpe!check_users
        contact_groups          system
        notifications_enabled   1
    }

    define service{

        use                     generic-service
        host_name               192.168.100.12
        service_description     check-total-procs
        check_command           check_nrpe!check_total_procs
        contact_groups          system
        notifications_enabled   1
    }

    define service{

        use                     generic-service
        host_name               192.168.100.12
        service_description     check-mariadb
        check_command           check_tcp!3306
        contact_groups          system
    }
  4. Append the new contacts and contact group to the end of the default contacts.cfg

    vi objects/contacts.cfg

    contacts.cfg

    define contact {

        contact_name            zl
        use                     generic-contact
        alias                   zl
        email                   email-address-1
    }

    define contact {

        contact_name            zl2
        use                     generic-contact
        alias                   zl2
        email                   email-address-2
    }

    define contactgroup {

        contactgroup_name       system
        alias                   system
        members                 zl,zl2
    }
  5. Shorten the check interval and the number of check attempts in the default templates.cfg so that monitoring results show up quickly.

    vi objects/templates.cfg
    # Line 72
    check_interval 1
    # Line 74
    max_check_attempts 2
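
To see why these two values speed things up: Nagios only treats a problem as confirmed (a hard state) after max_check_attempts consecutive non-OK results, and it re-checks at retry_interval while the problem is still unconfirmed. The sketch below estimates a rough upper bound for that delay in minutes, assuming the default interval_length of 60 seconds. The helper name minutes_to_hard_state is hypothetical, and the retry_interval value is an assumption because the article does not set it.

```shell
#!/bin/sh
# Sketch: rough upper bound, in minutes, between a failure and the hard state.
# minutes_to_hard_state is a hypothetical helper, not part of Nagios.
# retry_interval is not set in this article; the value passed below is an assumption.

minutes_to_hard_state() {
    # $1 = check_interval, $2 = retry_interval, $3 = max_check_attempts
    echo $(( $1 + ($3 - 1) * $2 ))
}

# check_interval 1, assumed retry_interval 1, max_check_attempts 2
minutes_to_hard_state 1 1 2   # prints 2
```

With larger values, such as a check interval of 5 and 10 attempts, the same formula gives 14 minutes, which is why lowering both settings makes test alerts arrive much sooner.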

Check the configuration syntax

/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
Total Warnings: 0
Total Errors:   0

Things look okay - No serious problems were detected during the pre-flight check
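
The check and the restart can be chained so that a broken configuration never gets loaded. The following is a minimal sketch that inspects the pre-flight output for the "Total Errors: 0" line shown above. The helper name preflight_ok is hypothetical, and the usage line is only a suggestion for the Nagios host.

```shell
#!/bin/sh
# Sketch: decide from the pre-flight output whether a restart is safe.
# preflight_ok is a hypothetical helper that reads "nagios -v" output on stdin.

preflight_ok() {
    # Succeeds only if a "Total Errors: 0" line is present in the input.
    grep -Eq '^Total Errors:[[:space:]]*0[[:space:]]*$'
}

# Usage on the Nagios host (not executed here):
#   /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg | preflight_ok \
#       && systemctl restart nagios.service
```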

Once the syntax check passes, restart the Nagios service

systemctl restart nagios.service
# url: http://192.168.100.11/nagios/

192.168.100.12

On the monitored host, list the IP addresses that are allowed to query it (the monitoring server must be included)

vi /usr/local/nagios/etc/nrpe.cfg
allowed_hosts=127.0.0.1,192.168.100.11
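
A quick way to confirm the setting after editing is to look for the monitoring server's address in that line. This is a minimal sketch with a hypothetical helper name, host_is_allowed. It only matches literal addresses, not the network ranges that NRPE also accepts.

```shell
#!/bin/sh
# Sketch: check whether an address appears in the allowed_hosts line of nrpe.cfg.
# host_is_allowed is a hypothetical helper; it compares literal addresses only.

host_is_allowed() {
    # $1 = address to look for, $2 = path to nrpe.cfg
    sed -n 's/^allowed_hosts=//p' "$2" | tr ',' '\n' | tr -d ' ' | grep -Fxq "$1"
}

# Usage on 192.168.100.12 (not executed here):
#   host_is_allowed 192.168.100.11 /usr/local/nagios/etc/nrpe.cfg && echo "allowed"
```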

Edit nrpe.cfg to enable the check commands. The command below lists the active lines, skipping comments and blank lines.

egrep -v "^#|^$" /usr/local/nagios/etc/nrpe.cfg
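
Each name used after check_nrpe! in hosts.cfg (check_load, check_users, check_total_procs) has to exist as a command[...] entry in nrpe.cfg. As a sketch, the hypothetical helper nrpe_command_names below prints the enabled command names so they can be compared against hosts.cfg.

```shell
#!/bin/sh
# Sketch: print the command names enabled in nrpe.cfg, one per line.
# nrpe_command_names is a hypothetical helper, not part of NRPE.

nrpe_command_names() {
    # $1 = path to nrpe.cfg
    # Commented-out entries are skipped because they do not start with "command[".
    sed -n 's/^command\[\([^]]*\)]=.*/\1/p' "$1"
}

# Usage on 192.168.100.12 (not executed here):
#   nrpe_command_names /usr/local/nagios/etc/nrpe.cfg
```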

Install MariaDB (the MySQL-compatible server that CentOS 7 ships) and start the service.

yum install -y mariadb-server
systemctl start mariadb

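How the definition files relate

Each service in hosts.cfg points at the other files by name: check_command refers to a command_name in commands.cfg, contact_groups refers to a contactgroup_name in contacts.cfg, and host_name refers to a host defined in hosts.cfg.

Command definition file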

define command{
    command_name            check_nrpe
    command_line            $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}
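
To make the macro substitution concrete, here is a minimal sketch of how this command_line would be filled in for the service check_nrpe!check_load on 192.168.100.12. The helper name expand_check_nrpe is hypothetical, and the plugin directory assumes that $USER1$ is set to /usr/local/nagios/libexec, the usual value in resource.cfg for a source install.

```shell
#!/bin/sh
# Sketch: mimic how Nagios fills in $USER1$, $HOSTADDRESS$ and $ARG1$.
# expand_check_nrpe is a hypothetical helper, not part of Nagios.

USER1="/usr/local/nagios/libexec"   # assumed value of $USER1$ (check resource.cfg)

expand_check_nrpe() {
    # $1 = host address ($HOSTADDRESS$)
    # $2 = text after "!" in check_command ($ARG1$)
    printf '%s/check_nrpe -H %s -c %s\n' "$USER1" "$1" "$2"
}

expand_check_nrpe 192.168.100.12 check_load
# prints: /usr/local/nagios/libexec/check_nrpe -H 192.168.100.12 -c check_load
```

Running the printed command by hand on the Nagios host is a handy way to test a check before it is added to hosts.cfg.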

Contact and contact group definition file

define contact {

    contact_name            zl
    use                     generic-contact
    alias                   zl
    email                   email-address-1
}

define contact {

    contact_name            zl2
    use                     generic-contact
    alias                   zl2
    email                   email-address-2
}

define contactgroup {

    contactgroup_name       system
    alias                   system
    members                 zl,zl2
}
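
Because services refer to contact groups by name only, a misspelled group is easy to miss until the pre-flight check complains. As a rough sketch, the hypothetical helper missing_contact_groups below lists every group that a service references but that no contactgroup defines. It assumes one group per contact_groups line, as in this article.

```shell
#!/bin/sh
# Sketch: list contact groups that services reference but no contactgroup defines.
# missing_contact_groups is a hypothetical helper; it expects a single group name
# on each contact_groups line.

missing_contact_groups() {
    # $1 = file with service definitions, $2 = file with contactgroup definitions
    awk '$1 == "contact_groups" { print $2 }' "$1" | sort -u |
    while read -r group; do
        # awk exits 0 only when a matching contactgroup_name line exists
        if ! awk -v g="$group" \
            '$1 == "contactgroup_name" && $2 == g { found = 1 } END { exit !found }' "$2"
        then
            echo "$group"
        fi
    done
}

# Usage on the Nagios host (not executed here):
#   missing_contact_groups /usr/local/nagios/etc/conf/hosts.cfg \
#       /usr/local/nagios/etc/objects/contacts.cfg
```

No output means every referenced group is defined.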

Host, host group, and service definition file

define host{

    use                     linux-server
    host_name               192.168.100.12
    alias                   192.168.100.12
    address                 192.168.100.12
    hostgroups              slaves
}

define hostgroup{

    hostgroup_name          slaves
    alias                   Linux Servers
}

define service{

    use                     generic-service 
    host_name               192.168.100.12
    service_description     check-load 
    check_command           check_nrpe!check_load
    contact_groups          system
    notifications_enabled   1
}

define service{

    use                     generic-service 
    host_name               192.168.100.12
    service_description     check-users
    check_command           check_nrpe!check_users
    contact_groups          system
    notifications_enabled   1
}

define service{

    use                     generic-service 
    host_name               192.168.100.12
    service_description     check-total-procs
    check_command           check_nrpe!check_total_procs
    contact_groups          system
    notifications_enabled   1
}

define service{

    use                     generic-service 
    host_name               192.168.100.12
    service_description     check-mariadb
    check_command           check_tcp!3306
    contact_groups          system
}
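
The three NRPE services differ only in the command name, so they can be generated instead of copied by hand. The following is a minimal sketch, not the author's method. The helper name nrpe_service is hypothetical, and it reuses the layout and the system contact group from the definitions above.

```shell
#!/bin/sh
# Sketch: print one service definition per NRPE command name.
# nrpe_service is a hypothetical helper that follows the layout used in this article.

nrpe_service() {
    # $1 = host_name, $2 = NRPE command name such as check_load
    cat <<EOF
define service{
    use                     generic-service
    host_name               $1
    service_description     $(echo "$2" | tr '_' '-')
    check_command           check_nrpe!$2
    contact_groups          system
    notifications_enabled   1
}
EOF
}

for cmd in check_load check_users check_total_procs; do
    nrpe_service 192.168.100.12 "$cmd"
done
```

The output can be appended to conf/hosts.cfg, followed by the syntax check before restarting Nagios.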