Nagios core CLI is working but Nagios web console is not.
Posted: Wed Dec 19, 2018 2:32 am
Hi Team,
I hope everyone is staying warm and energetic this winter.
I am facing a weird issue.
Our Nagios Core CLI is working fine, but our Nagios web console is intermittent: sometimes it works and sometimes it does not, and it switches between the two on its own, without any change on our side.
I tried a few steps to narrow down the issue:
1) Checked the Nagios and apache2 logs
Nagios logs
tail -f nagios.log
[1545198312] wproc: stdout line 79: 250 2.1.5 <nagios@ip-172-31-32-238.ap-south-1.compute.internal>... Recipient ok
[1545198312] wproc: stdout line 80: 354 Enter mail, end with "." on a line by itself
[1545198312] wproc: stdout line 81: >>> .
[1545198312] wproc: stdout line 82: 050 <nagios@ip-172-31-32-238.ap-south-1.compute.internal>... Connecting to local...
[1545198312] wproc: stdout line 83: 050 <nagios@ip-172-31-32-238.ap-south-1.compute.internal>... Sent
[1545198312] wproc: stdout line 84: 250 2.0.0 wBJ5j6iv006498 Message accepted for delivery
[1545198312] wproc: stdout line 85: nagios... Sent (wBJ5j6iv006498 Message accepted for delivery)
[1545198312] wproc: stdout line 86: Closing connection to [127.0.0.1]
[1545198312] wproc: stdout line 87: >>> QUIT
[1545198312] wproc: stdout line 88: 221 2.0.0 ip-172-31-32-238.ap-south-1.compute.internal closing connection
Apache2 error logs
tail -f error.log
[Wed Dec 19 06:25:01.490007 2018] [mpm_prefork:notice] [pid 3872] AH00163: Apache/2.4.18 (Ubuntu) configured -- resuming normal operations
[Wed Dec 19 06:25:01.490031 2018] [core:notice] [pid 3872] AH00094: Command line: '/usr/sbin/apache2'
Apache2 access logs
tail -f access.log
127.0.0.1 - - [19/Dec/2018:06:32:06 +0000] "GET / HTTP/1.0" 200 11595 "-" "check_http/v2.2.1 (nagios-plugins 2.2.1)"
127.0.0.1 - - [19/Dec/2018:06:37:06 +0000] "GET / HTTP/1.0" 200 11595 "-" "check_http/v2.2.1 (nagios-plugins 2.2.1)"
127.0.0.1 - - [19/Dec/2018:06:42:06 +0000] "GET / HTTP/1.0" 200 11595 "-" "check_http/v2.2.1 (nagios-plugins 2.2.1)"
127.0.0.1 - - [19/Dec/2018:06:47:06 +0000] "GET / HTTP/1.0" 200 11595 "-" "check_http/v2.2.1 (nagios-plugins 2.2.1)"
192.168.5.224 - - [19/Dec/2018:06:50:25 +0000] "GET /nagios/cgi-bin/status.cgi?host=all&servicestatustypes=28 HTTP/1.1" 401 748 "http://nagios.lendingkart.com/nagios/cg ... ustypes=28" "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36"
192.168.5.224 - - [19/Dec/2018:06:50:35 +0000] "GET /nagios/cgi-bin/status.cgi?hostgroup=all&style=hostdetail&hoststatustypes=12 HTTP/1.1" 401 748 "http://nagios.lendingkart.com/nagios/cg ... ustypes=12" "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36"
192.168.5.224 - - [19/Dec/2018:06:50:45 +0000] "GET /nagios/cgi-bin/outages.cgi HTTP/1.1" 401 748 "http://nagios.lendingkart.com/nagios/cg ... utages.cgi" "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36"
127.0.0.1 - - [19/Dec/2018:06:52:06 +0000] "GET / HTTP/1.0" 200 11595 "-" "check_http/v2.2.1 (nagios-plugins 2.2.1)"
127.0.0.1 - - [19/Dec/2018:06:57:06 +0000] "GET / HTTP/1.0" 200 11595 "-" "check_http/v2.2.1 (nagios-plugins 2.2.1)"
127.0.0.1 - - [19/Dec/2018:07:02:06 +0000] "GET / HTTP/1.0" 200 11595 "-" "check_http/v2.2.1 (nagios-plugins 2.2.1)"
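To make the pattern in the access log easier to see, the requests can be tallied by client address and HTTP status. This is only a minimal sketch, assuming the default Apache "combined" log format (client address in field 1, status code in field 9). The function name `status_by_client` is hypothetical, and the log path in the example is an assumption for a stock Ubuntu install.

```shell
# Hypothetical helper: count requests per (client, status) pair.
# Assumes Apache "combined" log format: $1 = client address, $9 = status code.
# Reads the files given as arguments, or stdin when none are given.
status_by_client() {
    awk '{ count[$1 " " $9]++ }
         END { for (key in count) print key, count[key] }' "$@" | sort
}

# Example (path is an assumption for a stock Ubuntu install):
# status_by_client /var/log/apache2/access.log
```

On the excerpt above this would report seven 200 responses from 127.0.0.1 (the local check_http probe) and three 401 responses from 192.168.5.224, which suggests that those browser requests did reach Apache but were not authenticated.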
2) I checked whether the apache2 and Nagios services were running while the Nagios UI was not working.
Yes, both were running fine.
$ sudo service apache2 status
● apache2.service - LSB: Apache2 web server
Loaded: loaded (/etc/init.d/apache2; bad; vendor preset: enabled)
Drop-In: /lib/systemd/system/apache2.service.d
└─apache2-systemd.conf
Active: active (running) since Tue 2018-12-18 12:40:11 UTC; 17h ago
Docs: man:systemd-sysv-generator(8)
Process: 3754 ExecStop=/etc/init.d/apache2 stop (code=exited, status=0/SUCCESS)
Process: 29913 ExecReload=/etc/init.d/apache2 reload (code=exited, status=0/SUCCESS)
Process: 3855 ExecStart=/etc/init.d/apache2 start (code=exited, status=0/SUCCESS)
Tasks: 11
Memory: 41.5M
CPU: 2.989s
CGroup: /system.slice/apache2.service
├─ 3872 /usr/sbin/apache2 -k start
├─ 3875 /usr/sbin/apache2 -k start
├─ 3876 /usr/sbin/apache2 -k start
├─ 3877 /usr/sbin/apache2 -k start
├─ 3879 /usr/sbin/apache2 -k start
├─ 8393 /usr/sbin/apache2 -k start
├─ 8449 /usr/sbin/apache2 -k start
├─ 8462 /usr/sbin/apache2 -k start
├─ 8463 /usr/sbin/apache2 -k start
├─ 8464 /usr/sbin/apache2 -k start
└─13687 /usr/sbin/apache2 -k start
$ sudo service nagios status
● nagios.service - Nagios
Loaded: loaded (/etc/systemd/system/nagios.service; enabled; vendor preset: enabled)
Active: active (running) since Tue 2018-12-18 12:40:47 UTC; 18h ago
Main PID: 4017 (nagios)
Tasks: 8
Memory: 41.4M
CPU: 4min 10.724s
CGroup: /system.slice/nagios.service
├─ 4017 /usr/local/nagios/bin/nagios /usr/local/nagios/etc/nagios.cfg
├─ 4018 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
├─ 4019 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
├─ 4020 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
├─ 4021 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
├─ 4024 /usr/local/nagios/bin/nagios /usr/local/nagios/etc/nagios.cfg
├─15379 /usr/local/nagios/libexec/check_ping -H 172.31.32.115 -w 3000.0,80% -c 5000.0,100% -p 5
└─15380 /bin/ping -n -U -W 30 -c 5 172.31.32.115
Dec 19 07:04:17 ip-172-31-32-238 check_nrpe[14527]: Remote 172.31.33.151 does not support Version 3 Packets
Dec 19 07:04:17 ip-172-31-32-238 check_nrpe[14527]: Remote 172.31.33.151 accepted a Version 2 Packet
Dec 19 07:05:07 ip-172-31-32-238 check_nrpe[14603]: Remote 172.31.27.83 does not support Version 3 Packets
Dec 19 07:05:07 ip-172-31-32-238 check_nrpe[14603]: Remote 172.31.27.83 accepted a Version 2 Packet
Dec 19 07:05:10 ip-172-31-32-238 check_nrpe[14604]: Remote 172.31.33.204 does not support Version 3 Packets
Dec 19 07:05:10 ip-172-31-32-238 check_nrpe[14604]: Remote 172.31.33.204 accepted a Version 2 Packet
Dec 19 07:05:14 ip-172-31-32-238 check_nrpe[14613]: Remote 172.31.33.151 does not support Version 3 Packets
Dec 19 07:05:14 ip-172-31-32-238 check_nrpe[14613]: Remote 172.31.33.151 accepted a Version 2 Packet
Dec 19 07:08:02 ip-172-31-32-238 check_nrpe[15272]: Remote 172.31.33.151 does not support Version 3 Packets
Dec 19 07:08:02 ip-172-31-32-238 check_nrpe[15272]: Remote 172.31.33.151 accepted a Version 2 Packet
$ sudo netstat -nlp | grep 80
tcp6 0 0 :::80 :::* LISTEN 3872/apache2
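A plain `grep 80` matches any line that contains "80" anywhere (a PID, or another port such as 8080), so a stricter filter on the local-address column is less ambiguous. A minimal sketch, assuming the `netstat -nlp` column layout shown above (local address in field 4, state in field 6). `listeners_on_port` is a hypothetical helper name.

```shell
# Hypothetical helper: keep only listening TCP sockets whose local port is exactly $1.
# Assumes `netstat -nlp`-style columns on stdin: $4 = local address, $6 = state.
listeners_on_port() {
    awk -v port="$1" '$6 == "LISTEN" && $4 ~ (":" port "$")'
}

# Example: sudo netstat -nlp | listeners_on_port 80
```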
As far as I can tell, everything else in Nagios is working fine and we are receiving emails properly. The only problem is that the Nagios Core UI (web interface) intermittently stops responding; at other times it works fine.
Note: here is one more piece of information that may help you look deeper into the issue:
I ran this curl command from my local machine (our Nagios server is in a private subnet, and we use a VPN in our organization).
## This is while everything was running except Nagios core WebUI ##
lendingkart@LK-LP-565:~$ curl -I nagios.lendingkart.com
HTTP/1.1 502 Connection refused
Date: Tue, 18 Dec 2018 14:08:37 GMT
Cache-Control: no-cache
Pragma: no-cache
Content-Type: text/html; charset="UTF-8"
Content-Length: 71993
Via: HTTP/1.1 forward.http.proxy:3128
Connection: close
## This is while everything was running including Nagios core Web UI ##
lendingkart@LK-LP-565:~$ curl -I nagios.lendingkart.com
HTTP/1.1 200 OK
Date: Tue, 18 Dec 2018 14:20:00 GMT
Server: Apache/2.4.18 (Ubuntu)
Last-Modified: Mon, 25 Dec 2017 18:09:35 GMT
ETag: "2c39-5612e1196dfcf"
Accept-Ranges: bytes
Content-Length: 11321
Vary: Accept-Encoding
Keep-Alive: timeout=5, max=100
Content-Type: text/html
Via: HTTP/1.1 forward.http.proxy:3128
Connection: keep-alive
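Both responses carry `Via: HTTP/1.1 forward.http.proxy:3128`, and the 502 has no `Server: Apache` header, so it may have been generated by the forward proxy rather than by the Nagios host. The sketch below classifies a saved `curl -I` response on that basis. `response_origin` is a hypothetical helper, and the header heuristic is an assumption, not a definitive test.

```shell
# Hypothetical helper: read `curl -I` output on stdin and guess who answered.
# Heuristic (an assumption): a reply from the origin Apache carries a
# "Server: Apache..." header; a reply with only a "Via:" header was
# probably produced by the proxy itself.
response_origin() {
    awk 'tolower($1) == "server:" && tolower($2) ~ /^apache/ { server = 1 }
         tolower($1) == "via:" { via = 1 }
         END {
             if (server)   print "origin"
             else if (via) print "proxy"
             else          print "unknown"
         }'
}

# Example: curl -sI nagios.lendingkart.com | response_origin
# To bypass the proxy for a single request:
# curl -sI --noproxy '*' nagios.lendingkart.com
```

If the request that bypasses the proxy succeeds while the proxied one returns 502, the proxy or the path between the proxy and the server would be the next place to look.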
Can someone please help us solve this?