Monitoring engine and Nagvis not working as expected
The monitoring engine was in a hung state. After restarting gearmand, the worker service, and the Nagios service, it is still not working as expected.
We can see that the monitoring engine event queue is not updating in a timely manner. Please see the attachment.
We suspect this is why Nagvis is showing the error in the attachment.
We need immediate assistance and can set up a WebEx session to troubleshoot this issue further.
Re: Monitoring engine and Nagvis not working as expected
It looks like the ndo2db backend is not running on the server.
Let's stop the nagios daemon, start the backend, and then restart nagios by running the following as root:
service nagios stop
killall -9 nagios
service ndo2db start
service nagios start
If you receive any errors from the above, post them here.
Be sure to check out our Knowledgebase for helpful articles and solutions!
Re: Monitoring engine and Nagvis not working as expected
# killall -9 nagios
nagios: no process killed
# service ndo2db stop
Stopping ndo2db: head: cannot open `/usr/local/nagios/var/ndo2db.lock' for reading: No such file or directory
done.
I then started the ndo2db service and the nagios service again.
Re: Monitoring engine and Nagvis not working as expected
Did everything go back to normal?
Re: Monitoring engine and Nagvis not working as expected
It worked for some time, but we are now seeing the same issue again. Please find the attachment.
avandemore (Posts: 1597, Joined: Tue Sep 27, 2016 4:57 pm)
Re: Monitoring engine and Nagvis not working as expected
What is the output of:
# service ndo2db status
# service nagios status
# service gearmand status
Previous Nagios employee
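For anyone who wants to run those three checks in one pass, here is a minimal sketch. The function name check_services is made up for illustration, and it assumes a SysV-style `service` command whose `status` action exits non-zero when the daemon is not running. That is the usual init-script behavior, but confirm it on your own system.

```shell
# Hypothetical helper: report whether each named service is running.
# Assumes `service <name> status` exits 0 when the daemon is up.
check_services() {
    failed=0
    for svc in "$@"; do
        if service "$svc" status >/dev/null 2>&1; then
            echo "$svc: running"
        else
            echo "$svc: NOT running"
            failed=1
        fi
    done
    # Non-zero exit status if at least one service was down.
    return "$failed"
}

# Example call (run as root):
#   check_services ndo2db nagios gearmand
```

The non-zero return makes it easy to call from cron or a wrapper script and alert when any of the three daemons has stopped.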
Re: Monitoring engine and Nagvis not working as expected
# service ndo2db status
ndo2db (pid 2640) is running...
root@nagmonus1:(06-07 07:40): /root
# service nagios status
nagios (pid 29289) is running...
root@nagmonus1:(06-07 07:40): /root
# service gearmand status
gearmand (pid 13359) is running...
Let me know what logs you need for further troubleshooting. This is really affecting our ability to monitor our environment.
Please handle this case as high priority and let us know the next steps.
Re: Monitoring engine and Nagvis not working as expected
Let me know if we can have a quick WebEx/screen share session for further troubleshooting.
Re: Monitoring engine and Nagvis not working as expected
In nagios.log we observed the errors below:
[1496844505] ndomod: Successfully reconnected to data sink! 2759 items lost, 5000 queued items to flush.
[1496844505] ndomod: Error writing to data sink! Some output may get lost. 4851 queued items to flush.
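To get a feel for how often ndomod is losing its connection to the data sink, the messages above can be tallied with a short script. This is only a sketch: the function name summarize_ndomod is made up for illustration, and it matches only the two message formats quoted in this thread.

```shell
# Hypothetical helper: summarize ndomod data-sink messages read from stdin.
# Prints three numbers: <reconnects> <write_errors> <total_items_lost>
summarize_ndomod() {
    awk '
        /ndomod: Successfully reconnected to data sink!/ {
            reconnects++
            # The lost-item count is the field just before "items lost,".
            for (i = 2; i < NF; i++)
                if ($i == "items" && $(i + 1) == "lost,")
                    lost += $(i - 1)
        }
        /ndomod: Error writing to data sink!/ { errors++ }
        END { printf "%d %d %d\n", reconnects, errors, lost }
    '
}

# Example call (the log path is an assumption based on the
# /usr/local/nagios/var directory seen earlier in this thread):
#   summarize_ndomod < /usr/local/nagios/var/nagios.log
```

A steadily growing reconnect count or lost-item total would suggest that ndo2db keeps dropping or cannot keep up, rather than a one-off interruption.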
avandemore
Re: Monitoring engine and Nagvis not working as expected
XI > Admin > System Profile > Download Profile
Please include the zip file in your response. You can PM the profile to me or to other support personnel.
Remote sessions are done only if our support personnel deem one necessary and request it. If you want priority support, our phone support option allows you to call in at any point and jump the queue. Remote sessions are usually initiated in that setting, as they are generally needed for fast resolution. If you already have phone support, you can find the contact information here:
https://www.nagios.com/contact/