Nagios XI did not trigger a notification

Post by apteancloud » Thu Apr 08, 2021 8:02 am

Hi Team,

Nagios XI did not trigger a notification during a CPU load spike on one of our servers in Azure. According to the Azure metrics, CPU utilization spiked to 98%; because of the spike, the server hung for an hour and we had to reboot it. The Nagios plugin check we are using is shown below.

No alert was triggered in Nagios; for that time frame, the Nagios performance graph shows only 17%.

Code:
[nagios@NagiosXIAzPrd ~]$ /usr/local/nagios/libexec/check_nt -H -p 12489 -s "sprt575" -v CPULOAD -l 15,85,90
CPU Load 3% (15 min average) | '15 min avg Load'=3%;85;90;0;100
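
For readers unfamiliar with the part after the pipe: it is Nagios performance data, which by the plugin guidelines follows the pattern 'label'=value[UOM];warn;crit;min;max. The sketch below is only an illustration of how to read that string, not part of check_nt. The helper names parse_perfdata and classify are hypothetical, it assumes plain numeric thresholds (no range syntax), and it assumes a simple greater-than-or-equal comparison against the thresholds.

```python
import re


def parse_perfdata(item):
    """Parse one perfdata item of the form 'label'=value[UOM];warn;crit;min;max."""
    label, sep, fields = item.strip().rpartition("=")
    if not sep:
        raise ValueError(f"no '=' found in perfdata item: {item!r}")
    parts = fields.split(";")
    # First field is the numeric value followed by an optional unit such as '%'.
    match = re.fullmatch(r"(-?\d+(?:\.\d+)?)(\D*)", parts[0])
    if match is None:
        raise ValueError(f"cannot parse value field: {parts[0]!r}")
    # Assumption: thresholds are plain numbers, not range expressions.
    limits = [float(p) if p else None for p in parts[1:5]]
    limits += [None] * (4 - len(limits))
    warn, crit, minimum, maximum = limits
    return {
        "label": label.strip("'"),
        "value": float(match.group(1)),
        "uom": match.group(2),
        "warn": warn,
        "crit": crit,
        "min": minimum,
        "max": maximum,
    }


def classify(value, warn, crit):
    """Assumed comparison: a value at or above a threshold trips that state."""
    if crit is not None and value >= crit:
        return "CRITICAL"
    if warn is not None and value >= warn:
        return "WARNING"
    return "OK"


sample = parse_perfdata("'15 min avg Load'=3%;85;90;0;100")
print(sample["label"], sample["value"], classify(sample["value"], sample["warn"], sample["crit"]))
```

Read this way, the output above reports a 15-minute average of 3% against a warning threshold of 85% and a critical threshold of 90%. The 17% seen on the performance graph is also well below both thresholds, which is consistent with no notification being sent.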

Attached are the Azure metric graph and the Nagios performance graph for the same time frame. Please take a look.


Thanks in advance.

Re: Nagios XI did not trigger a notification

Post by dchurch » Thu Apr 08, 2021 3:54 pm

How many CPUs are in the host?

Because the spike only reached ~20%, it seems to me that NSClient is calculating the load across all CPUs (with an absolute maximum of 100%), whereas I think your assumption was that it would be in terms of individual CPUs, e.g. 100% for 1 CPU pegged, 200% for 2 CPUs pegged, etc.
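
To make the two conventions concrete, here is a small sketch. The per-core readings are made-up numbers for illustration, and the function names are hypothetical; nothing here is taken from NSClient itself.

```python
def averaged_pct(per_core_pct):
    """Load averaged across all CPUs: never exceeds 100%."""
    return sum(per_core_pct) / len(per_core_pct)


def summed_pct(per_core_pct):
    """Load summed per CPU: 100% for each fully pegged CPU."""
    return sum(per_core_pct)


# Hypothetical 4-CPU host with exactly one CPU pegged.
readings = [100.0, 0.0, 0.0, 0.0]
print(averaged_pct(readings), summed_pct(readings))
```

With one of four CPUs pegged, the averaged figure is 25% while the summed figure is 100%, so the same thresholds behave very differently depending on which convention the agent uses.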

You could try lowering the averaging window to, say, 5 minutes, and decreasing the check interval too. With a 15-minute average, an otherwise idle CPU would have to be pegged for about 7.5 minutes straight to get the needle to move to 50%. So the threshold argument would become -l 5,85,90
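
The arithmetic behind that can be sketched as follows. This assumes the reported figure is a simple unweighted mean over the window and that the CPU is either fully busy or at a fixed idle level; how NSClient actually samples is not confirmed here, and the function names are hypothetical.

```python
def window_average(pegged_minutes, window_minutes, busy_pct=100.0, idle_pct=0.0):
    """Mean load over the window when the CPU is busy for part of it."""
    if not 0 <= pegged_minutes <= window_minutes:
        raise ValueError("pegged_minutes must lie within the window")
    idle_minutes = window_minutes - pegged_minutes
    return (pegged_minutes * busy_pct + idle_minutes * idle_pct) / window_minutes


def minutes_to_reach(threshold_pct, window_minutes, busy_pct=100.0, idle_pct=0.0):
    """Minutes of sustained busy load before the mean reaches the threshold.

    Returns None when the threshold is above the busy level and so unreachable.
    """
    if busy_pct <= idle_pct:
        raise ValueError("busy_pct must be greater than idle_pct")
    if threshold_pct > busy_pct:
        return None
    needed = window_minutes * (threshold_pct - idle_pct) / (busy_pct - idle_pct)
    return max(0.0, needed)


for window in (15, 5):
    print(window, minutes_to_reach(85, window), minutes_to_reach(90, window))
```

Under these assumptions, a 15-minute window needs 12.75 minutes of a fully pegged CPU to reach the 85% warning threshold, while a 5-minute window needs 4.25 minutes, which is why a shorter window reacts to a spike much sooner.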

I'm not sure why the value is different between the Azure console and what Nagios captured. Perhaps NSClient is miscounting the CPUs? You could try decreasing the thresholds to 12% to work around this.

Really, though, NSClient is deprecated, insecure, and hasn't been maintained since 2014. I'd consider replacing it with NCPA or NSClient++. You may have better results with NCPA, since I know that it actually gives you an option to report load either averaged across CPUs or summed.