Check_HP keeps flapping

szkoda
Posts: 42
Joined: Thu Dec 04, 2014 8:48 am

Check_HP keeps flapping

Post by szkoda »

I've downloaded and (eventually) successfully configured the check_hp plugin from Nagios Exchange, and initially it reported back absolutely fine. Now, after adding a few more servers to Nagios, the check_hp service seems to "flap" constantly with the error "no SNMP response". Thing is, I know SNMP is working on these servers because I am using the check_snmp plugin, which has never "flapped".
Running the check_hp command manually also only works intermittently.

I'm not sure whether the plugin is buggy or whether the Nagios server is getting overloaded somehow. It's a VM with only 1 GB of RAM assigned to it - could this be the issue?

I should also note that the only servers that seem to have this issue are remote ones that I am monitoring - local servers never have this service flap at all.

Any pointers or suggestions would be most appreciated.
szkoda
Posts: 42
Joined: Thu Dec 04, 2014 8:48 am

Re: Check_HP keeps flapping

Post by szkoda »

Just to add, I use NS client for all my other checks and that works flawlessly.

System monitor was reporting high CPU usage, so I've given the VM another core to work with. That seems to have made the HP checks start returning data.

I'll monitor and see if it recurs. Any pointers in the meantime would be appreciated.
slansing
Posts: 7698
Joined: Mon Apr 23, 2012 4:28 pm
Location: Travelling through time and space...

Re: Check_HP keeps flapping

Post by slansing »

What states is it flapping between? Is it actually returning valid output at any time, or is it just giving you the same error output? Is there any regular pattern to the times when it's flapping? Every 5 minutes, or something like that?
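
One way to answer those questions is to run the plugin by hand several times in a row and log the state and run time of each attempt. Below is a minimal Python sketch of that idea. The helper name `time_check` and the example plugin path, host address, and arguments in the comment are assumptions for illustration, not something taken from the plugin's documentation.

```python
import subprocess
import time

# Standard Nagios plugin exit codes.
STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def time_check(command, runs=5, pause=0.0, timeout=60.0):
    """Run `command` repeatedly.

    Returns a list of (state, seconds, first_output_line) tuples, one per run.
    A run that exceeds `timeout` seconds is recorded with state "TIMEOUT".
    """
    results = []
    for _ in range(runs):
        start = time.monotonic()
        try:
            proc = subprocess.run(
                command, capture_output=True, text=True, timeout=timeout
            )
            state = STATES.get(proc.returncode, "UNKNOWN")
            lines = proc.stdout.strip().splitlines()
            output = lines[0] if lines else ""
        except subprocess.TimeoutExpired:
            state, output = "TIMEOUT", ""
        elapsed = time.monotonic() - start
        results.append((state, elapsed, output))
        time.sleep(pause)  # wait between runs to spread them over time
    return results


# Hypothetical usage: adjust the path, host, and plugin arguments to your setup.
# cmd = ["/usr/local/nagios/libexec/check_hp", "-H", "192.0.2.10", "-C", "public"]
# for state, elapsed, output in time_check(cmd, runs=10, pause=5):
#     print(f"{state:8} {elapsed:6.2f}s  {output}")
```

If the failed runs consistently take much longer than the successful ones, that points towards a timeout rather than a plugin bug.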
szkoda
Posts: 42
Joined: Thu Dec 04, 2014 8:48 am

Re: Check_HP keeps flapping

Post by szkoda »

It flaps between giving proper output from the plugin (e.g. disk failed, network redundancy lost, etc.) and receiving no SNMP response at all.

Last night I added another 1GB of RAM and another virtual processor and (touch wood) it seems to have stabilised.

Is this the sort of behavior I can expect from Nagios if it's being overloaded?
slansing
Posts: 7698
Joined: Mon Apr 23, 2012 4:28 pm
Location: Travelling through time and space...

Re: Check_HP keeps flapping

Post by slansing »

Not typically, no. Without seeing where you were bottlenecking, if at all, it's hard to tell. Did you notice high load or anything else that indicated a bottleneck?
szkoda
Posts: 42
Joined: Thu Dec 04, 2014 8:48 am

Re: Check_HP keeps flapping

Post by szkoda »

OK, it's still happening and I still cannot work out why - it doesn't seem to correlate with high load or with anything in the Nagios event log. The only additional info I can give you is that when it happens, it's always more than one server that the HP check fails on, though not necessarily all of them and not necessarily the same servers.

I'm baffled!
szkoda
Posts: 42
Joined: Thu Dec 04, 2014 8:48 am

Re: Check_HP keeps flapping

Post by szkoda »

Interestingly, when it happens the check_snmp service stays up OK, so Nagios must be getting some sort of SNMP response from those servers!
tgriep
Madmin
Posts: 9177
Joined: Thu Oct 30, 2014 9:02 am

Re: Check_HP keeps flapping

Post by tgriep »

Try increasing the timeout value for the check_hp command and see if that helps out.
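
As a rough sketch of where a timeout can be raised: Nagios Core has a global `service_check_timeout` setting in `nagios.cfg`, and many plugins also accept their own timeout option on the command line. The `-t` flag, the other plugin arguments, and the values below are assumptions for illustration only. Check the plugin's help output for the options your version of check_hp actually supports.

```
# nagios.cfg: maximum time (in seconds) Nagios Core lets a service check run
service_check_timeout=90

# commands.cfg: hypothetical check_hp command definition
# ASSUMPTION: -t is the plugin's timeout option; confirm with the plugin's help output
define command {
    command_name    check_hp
    command_line    $USER1$/check_hp -H $HOSTADDRESS$ -C $ARG1$ -t 30
}
```

Nagios needs to be restarted after changing either file for the new values to take effect.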
szkoda
Posts: 42
Joined: Thu Dec 04, 2014 8:48 am

Re: Check_HP keeps flapping

Post by szkoda »

Will give it a go. They've just released a new version too, so I'll try that as well.
szkoda
Posts: 42
Joined: Thu Dec 04, 2014 8:48 am

Re: Check_HP keeps flapping

Post by szkoda »

Stupid question - I've just realized I have no idea how to change the timeout value, lol.

Can anyone tell me the correct syntax please?