check_all_procs start reporting wrong every morning around 4

Posted: Wed Feb 19, 2020 5:34 am
by davidrk
Having a weird problem.

I am using Nagios to monitor 2 servers with check_all_procs, and every morning, on both servers, Nagios stops reporting the process count correctly.
Typically they are between 110-140, but every morning around 4am ET they shoot down to 2-3.

Reboot the servers, and all is well again until around 4am.

Nothing is running on the servers at that time to be causing an issue.

Anyone have any suggestions where to start looking?

Thanks,
David

Re: check_all_procs start reporting wrong every morning arou

Posted: Wed Feb 19, 2020 5:12 pm
by Box293
What is in the logs?
/var/log/messages
/usr/local/nagios/var/nagios.log
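
Since the problem shows up around 4am, it may help to narrow the system log down to that hour. Below is a minimal sketch, assuming /var/log/messages uses the traditional syslog layout ("Mon DD HH:MM:SS host ..."); `lines_in_hour` is a hypothetical helper name, not a Nagios tool. It does not apply to nagios.log, which prefixes entries with an epoch timestamp in brackets.

```shell
#!/bin/sh
# Print syslog-style lines whose timestamp falls within a given hour.
# Assumption: the time is the third whitespace-separated field,
# as in "Feb 19 04:00:01 host program: message".
# lines_in_hour is a hypothetical helper, not part of Nagios.
lines_in_hour() {
    hour="$1"   # two-digit hour, e.g. 04
    file="$2"   # log file to scan, e.g. /var/log/messages
    awk -v h="$hour" '{ split($3, t, ":"); if (t[1] == h) print }' "$file"
}
```

For example, `lines_in_hour 04 /var/log/messages` would list everything logged between 04:00:00 and 04:59:59 on any day in that file.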

You may need to enable debug logging on Nagios. Try setting debug_level to -1 (log everything) and then restart Nagios.

sed -i 's/.*debug_level=.*/debug_level=-1/g' /usr/local/nagios/etc/nagios.cfg
service nagios restart

Then check the file /usr/local/nagios/var/nagios.debug

When you are finished, this turns debugging off:

sed -i 's/.*debug_level=.*/debug_level=0/g' /usr/local/nagios/etc/nagios.cfg
service nagios restart
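
To confirm which value is actually in effect after either sed command, something like the following could be used. This is only a sketch: `current_debug_level` is a hypothetical helper name, and it assumes the setting appears as an uncommented `debug_level=` line in the config file.

```shell
#!/bin/sh
# Print the debug_level value currently set in a Nagios config file.
# current_debug_level is a hypothetical helper, not a Nagios command.
# Assumption: the setting is an uncommented line of the form debug_level=N.
current_debug_level() {
    cfg="${1:-/usr/local/nagios/etc/nagios.cfg}"   # default path from this thread
    # Take the last matching line and print what follows the "=".
    grep -E '^debug_level=' "$cfg" | tail -n 1 | cut -d= -f2
}
```

Running `current_debug_level` with no argument reads the config path used above; -1 means debugging is on, 0 means it is off.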

Re: check_all_procs start reporting wrong every morning arou

Posted: Thu Feb 20, 2020 5:27 am
by davidrk
The only thing I am seeing close to that time is the log rotation on the server.

But I don't see how that would be affecting Nagios.

Re: check_all_procs start reporting wrong every morning arou

Posted: Thu Feb 20, 2020 5:25 pm
by Box293
Did you enable debug logging?

Re: check_all_procs start reporting wrong every morning arou

Posted: Fri Feb 21, 2020 4:08 am
by davidrk
I did enable it, and it failed on 2 of the 3 servers being monitored.
All are running CentOS 7 and CWP.

Can I PM you the debug log to look at?

Re: check_all_procs start reporting wrong every morning arou

Posted: Fri Feb 21, 2020 3:24 pm
by benjaminsmith
Hello David,
"Can I PM you the debug log to look at?"
Certainly. Please send it to me in PM and I can share this with Box293. Thanks.

Re: check_all_procs start reporting wrong every morning arou

Posted: Sat Feb 22, 2020 4:17 am
by davidrk
I don't think this is a Nagios XI problem, since the 3rd server, which happens to be a non-Pro CWP server, is reporting stats correctly.

Attached is the debug file, if someone can take a look just to double-check it.
ATL3 and ATL6 have the problem, and have to be rebooted to start reporting correctly again, but EWR3 is OK.


Thanks,
David

Re: check_all_procs start reporting wrong every morning arou

Posted: Sat Feb 22, 2020 4:46 am
by davidrk
Here is the graph from Nagios.

Weird.
All is well after a manual reboot of the server.

Re: check_all_procs start reporting wrong every morning arou

Posted: Mon Feb 24, 2020 4:21 pm
by Box293
This seems to be entirely an issue on your servers and nothing to do with Nagios.