Monitoring engine and Nagvis not working as expected

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
bosecorp
Posts: 929
Joined: Thu Jun 26, 2014 1:00 pm

Monitoring engine and Nagvis not working as expected

Post by bosecorp »

The monitoring engine was in a hung state. After restarting gearmand, the worker service, and the Nagios service, the monitoring engine is still not working as expected.
We can see that the monitoring engine event queue is not updating in a timely manner. Please see the attachment.

We suspect this is why Nagvis is showing the error in the attachment.
We need immediate assistance and can join a WebEx session to troubleshoot this issue further.
tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Re: Monitoring engine and Nagvis not working as expected

Post by tgriep »

It looks like the ndo2db backend is not running on the server.
Let's stop the Nagios daemon, start the backend, and then restart Nagios by running the following as root.

service nagios stop
killall -9 nagios
service ndo2db start
service nagios start
If you receive any errors from the above commands, post them here.
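
The order of the steps above matters: Nagios is stopped first, ndo2db is started, and only then is Nagios started again. A minimal sketch of that sequence as a shell function follows. The name `restart_monitoring` and the optional dry-run argument are illustrative assumptions, not part of Nagios XI.

```shell
#!/bin/sh
# Hypothetical helper: runs the restart steps from this post in order.
# Pass "echo" as the first argument to print the commands instead of running them.
restart_monitoring() {
    runner=${1:-}
    $runner service nagios stop
    # killall may report "no process killed" if Nagios already exited; that is fine.
    $runner killall -9 nagios || true
    # Stop here if the backend fails to start, so Nagios is not started without it.
    $runner service ndo2db start || return 1
    $runner service nagios start
}

# Dry run (prints the four commands):
#   restart_monitoring echo
# Real run, as root:
#   restart_monitoring
```

Running the dry run first lets you confirm the sequence before touching the services.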
Be sure to check out our Knowledgebase for helpful articles and solutions!
bosecorp
Posts: 929
Joined: Thu Jun 26, 2014 1:00 pm

Re: Monitoring engine and Nagvis not working as expected

Post by bosecorp »

# killall -9 nagios
nagios: no process killed

# service ndo2db stop
Stopping ndo2db: head: cannot open `/usr/local/nagios/var/ndo2db.lock' for reading: No such file or directory
done.

We then started the ndo2db service and the nagios service again.
tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Re: Monitoring engine and Nagvis not working as expected

Post by tgriep »

Did everything go back to normal?
bosecorp
Posts: 929
Joined: Thu Jun 26, 2014 1:00 pm

Re: Monitoring engine and Nagvis not working as expected

Post by bosecorp »

It worked for some time, but we are seeing the same issue again. Please find the attachment.
avandemore
Posts: 1597
Joined: Tue Sep 27, 2016 4:57 pm

Re: Monitoring engine and Nagvis not working as expected

Post by avandemore »

What is the output of:

# service ndo2db status
# service nagios status
# service gearmand status
Previous Nagios employee
bosecorp
Posts: 929
Joined: Thu Jun 26, 2014 1:00 pm

Re: Monitoring engine and Nagvis not working as expected

Post by bosecorp »

# service ndo2db status
ndo2db (pid 2640) is running...
root@nagmonus1:(06-07 07:40): /root
# service nagios status
nagios (pid 29289) is running...
root@nagmonus1:(06-07 07:40): /root
# service gearmand status
gearmand (pid 13359) is running...

Let me know what logs you need for further troubleshooting. This is really affecting our ability to monitor our environment.
Please handle this case as high priority and let us know the next steps.
bosecorp
Posts: 929
Joined: Thu Jun 26, 2014 1:00 pm

Re: Monitoring engine and Nagvis not working as expected

Post by bosecorp »

Let me know if we can have a quick WebEx/screen share session for further troubleshooting.
bosecorp
Posts: 929
Joined: Thu Jun 26, 2014 1:00 pm

Re: Monitoring engine and Nagvis not working as expected

Post by bosecorp »

In nagios.log we observed the errors below:

[1496844505] ndomod: Successfully reconnected to data sink! 2759 items lost, 5000 queued items to flush.
[1496844505] ndomod: Error writing to data sink! Some output may get lost. 4851 queued items to flush.
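
The bracketed number at the start of each nagios.log line is a Unix epoch timestamp, which makes it hard to line up with the shell prompt times shown earlier. A minimal sketch for converting those stamps to readable UTC times follows, assuming GNU `date` is available. `humanize_nagios_log` is a hypothetical helper name, and the log path in the usage comment is the common default location, which is an assumption about this server.

```shell
#!/bin/sh
# Hypothetical helper: rewrite the leading [epoch] stamp of each line as UTC.
# Requires GNU date (for the -d "@epoch" form). Other lines pass through unchanged.
humanize_nagios_log() {
    while IFS= read -r line; do
        case "$line" in
            \[[0-9]*\]*)
                epoch=${line#\[}        # drop the leading "["
                epoch=${epoch%%\]*}     # keep everything before the first "]"
                rest=${line#*\]}        # text after the first "]"
                printf '[%s]%s\n' "$(date -u -d "@$epoch" '+%Y-%m-%d %H:%M:%S')" "$rest"
                ;;
            *)
                printf '%s\n' "$line"
                ;;
        esac
    done
}

# Usage (path is an assumption; adjust to your install):
#   grep 'ndomod:' /usr/local/nagios/var/nagios.log | humanize_nagios_log
```

For the two lines quoted above, the stamp 1496844505 corresponds to 2017-06-07 14:08:25 UTC.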
avandemore
Posts: 1597
Joined: Tue Sep 27, 2016 4:57 pm

Re: Monitoring engine and Nagvis not working as expected

Post by avandemore »

XI > Admin > System Profile > Download Profile

Please include the zip file in your response. You can also PM the profile to me or other support personnel.

Remote sessions are done when our support personnel deem one necessary and request it. If you want priority support, our phone support option allows you to call in at any point and jump the queue. Remote sessions are usually initiated in that setting, as they are generally needed for fast resolution. If you already have phone support, you can find the contact information here:

https://www.nagios.com/contact/