Service Check Timeout

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
sarfarosh
Posts: 211
Joined: Fri Oct 05, 2012 3:56 am

Service Check Timeout

Post by sarfarosh »

Hi Team,
I was getting a timeout error for the BGP Peer Status service, so I increased the check_nwc_health plugin timeout to -t 120, and when I run this command from the CLI I am getting output. But Nagios XI still shows the error "UNKNOW : Timeout after 60 seconds". I have set service_check_timeout = 120 in /usr/local/nagios/etc/nagios.cfg and restarted the nagios service. Please let me know if I am missing something or doing something wrong.
mcapra
Posts: 3739
Joined: Thu May 05, 2016 3:54 pm

Re: Service Check Timeout

Post by mcapra »

Just so you're aware, setting service_check_timeout above 60 can be dangerous with Nagios XI, since this setting is global and applies to all service checks. If your interval_length is set to 60 seconds and you have service checks with check_interval or retry_interval set to 1, a check can be allowed to run longer than the time between its scheduled runs, and you can create a situation where checks are overlapping. Bad news.
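
To make the arithmetic behind that concern concrete, here is a minimal sketch. The helper name `timeout_fits` is made up for illustration; it only compares the timeout against the scheduling period (interval_length multiplied by the check or retry interval) and does not query Nagios in any way.

```shell
# Hypothetical helper: does a check's timeout fit inside its scheduling period?
#   $1  interval_length in seconds (nagios.cfg)
#   $2  check_interval or retry_interval of the service, in interval units
#   $3  service_check_timeout in seconds
timeout_fits() {
    period=$(( $1 * $2 ))
    if [ "$3" -le "$period" ]; then
        echo "ok: timeout ${3}s fits in ${period}s period"
    else
        echo "overlap risk: timeout ${3}s exceeds ${period}s period"
        return 1
    fi
}
```

With the values from this thread, `timeout_fits 60 1 120` reports an overlap risk, while `timeout_fits 60 5 120` reports that the timeout fits.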

You've done everything relatively correctly from the Nagios XI end of things, but if the script is running for longer than 60 seconds you might want to diagnose what's causing that root behavior rather than trying to mitigate it by bumping up timeouts.
sarfarosh wrote: when I run this command from the CLI I am getting output.
How long does the command take to complete when it's executed from the CLI? Can you run it with debug output using the -v or --verbose flag and share the output?
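
If it helps, one rough way to capture the run time along with the exit status is a small wrapper like the one below. `timed_run` is a hypothetical name, not part of Nagios or the plugin; it passes the command's output through untouched and prints the elapsed whole seconds on stderr.

```shell
# Hypothetical helper: run any command, keep its output, and report how long
# it took and what it returned (Nagios plugins use 0=OK, 1=WARNING,
# 2=CRITICAL, 3=UNKNOWN).
timed_run() {
    start=$(date +%s)
    "$@"
    status=$?
    end=$(date +%s)
    echo "elapsed: $((end - start))s, exit status: $status" >&2
    return "$status"
}
```

Put `timed_run` in front of the exact check_nwc_health command line you normally run by hand, with the verbose flag added.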

If this command is taking a really super duper long time to run, it might be best to schedule it as a cron job and return the results to Nagios XI as a passive check.
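
A minimal sketch of the passive route, assuming the standard external command file at /usr/local/nagios/var/rw/nagios.cmd and a service that accepts passive checks. The function name, the host name "router01", and the wrapper in the comments are placeholders for illustration, not taken from this thread.

```shell
# Format a passive service check result in Nagios external-command syntax:
#   [timestamp] PROCESS_SERVICE_CHECK_RESULT;host;service;return_code;output
# Arguments: host name, service description, return code (0-3), plugin output.
passive_result_line() {
    printf '[%s] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%s;%s\n' \
        "$(date +%s)" "$1" "$2" "$3" "$4"
}

# Possible cron-driven use (command, host and path are assumptions):
#   output=$(your_slow_check_command | head -n 1); rc=$?
#   passive_result_line "router01" "BGP Peer Status" "$rc" "$output" \
#       > /usr/local/nagios/var/rw/nagios.cmd
```

Note that in the commented pipeline `rc` would hold the status of `head`, not of the check, so a real wrapper would need to capture the plugin's exit status before trimming its output.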
Former Nagios employee
https://www.mcapra.com/
tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Re: Service Check Timeout

Post by tgriep »

Changing the service timeout in the nagios.cfg file should have increased it for you so that the check can run longer.
Try stopping and starting the nagios process by running the following as root to see if the setting is updated.

Code:

# Stop the service, kill any leftover nagios processes, then start it again
service nagios stop
killall -9 nagios
service nagios start

Try that and see whether the timeout has increased to 120 seconds.
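
Before restarting, it can be worth confirming what the file actually contains. The sketch below uses a made-up helper, `cfg_value`, that prints the last uncommented value of a directive in a config file; it only reads the file and makes no claim about what the running daemon has loaded.

```shell
# Hypothetical helper: print the last uncommented value of a directive.
#   $1  directive name, e.g. service_check_timeout
#   $2  path to the config file
cfg_value() {
    sed -n "s/^[[:space:]]*$1[[:space:]]*=[[:space:]]*//p" "$2" | tail -n 1
}
```

For example, `cfg_value service_check_timeout /usr/local/nagios/etc/nagios.cfg` should print 120 if the change was saved.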
Be sure to check out our Knowledgebase for helpful articles and solutions!
sarfarosh
Posts: 211
Joined: Fri Oct 05, 2012 3:56 am

Re: Service Check Timeout

Post by sarfarosh »

Hi mcapra,
I have changed service_check_timeout back to 60 in nagios.cfg. Please find attached the timeout output and the make/model of the device. From the CLI it takes 70-75 seconds to fetch the output.

@tgriep, I already did that, but as @mcapra said, it may result in overlapping checks. So I am more interested in finding out why it is taking so long.

NOTE: I am able to get a response for other parameters, such as CPU and memory utilization, for the same device in 5-6 seconds.
tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Re: Service Check Timeout

Post by tgriep »

Setting the global timeout to 120 seconds should be fine: most plugins have a shorter default timeout built in, so only the plugins you give a longer timeout will use the extended limit.
The first thing to do is to update the plugin to the latest version to see if that speeds it up.
https://labs.consol.de/nagios/check_nwc ... index.html
sarfarosh
Posts: 211
Joined: Fri Oct 05, 2012 3:56 am

Re: Service Check Timeout

Post by sarfarosh »

Hi tgriep,
Upgrading the plugin solved the issue. Thank you very much.
tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Re: Service Check Timeout

Post by tgriep »

I am glad upgrading the plugin worked for you. I'll mark the post as solved and lock it. If you have any questions in the future, feel free to open a new post.