All Linux Server CPU Spike at same time

tgriep
Madmin
Posts: 9177
Joined: Thu Oct 30, 2014 9:02 am

Re: All Linux Server CPU Spike at same time

Post by tgriep »

Can you log in to the systems where you want to run check_load through check_nrpe, run the following commands, and post the output?

Code:

ls -l /usr/local/nagios/libexec/
su nagios
/usr/local/nagios/libexec/check_load -w 5,10,15 -c 6,11,17
Thanks
Be sure to check out our Knowledgebase for helpful articles and solutions!
kwhogster
Posts: 644
Joined: Wed Oct 14, 2015 6:51 pm
Location: Wood Ridge NJ USA
Contact:

Re: All Linux Server CPU Spike at same time

Post by kwhogster »

This was done on the Nagios server itself, which is one of my Linux servers that has this issue.

Code:

root@tgcs017:/# ls -l /usr/local/nagios/libexec/
total 7884
-rwxr-xr-x 1 nagios nagios 246464 Jul 22  2016 check_apt
-rwxr-xr-x 1 nagios nagios   2309 Jul 22  2016 check_breeze
-rwxr-xr-x 1 nagios nagios 253584 Jul 22  2016 check_by_ssh
lrwxrwxrwx 1 root   root        9 Jul 22  2016 check_clamd -> check_tcp
-rwxr-xr-x 1 nagios nagios 187936 Jul 22  2016 check_cluster
-r-sr-xr-x 1 root   nagios 255120 Jul 22  2016 check_dhcp
-rwxr-xr-x 1 nagios nagios 242912 Jul 22  2016 check_dig
-rwxr-xr-x 1 nagios nagios 267520 Jul 22  2016 check_disk
-rwxr-xr-x 1 nagios nagios   9348 Jul 22  2016 check_disk_smb
-rwxr-xr-x 1 nagios nagios 266856 Jul 22  2016 check_dns
-rwxr-xr-x 1 nagios nagios 147808 Jul 22  2016 check_dummy
-rwxr-xr-x 1 nagios nagios   3542 Jul 22  2016 check_file_age
-rwxr-xr-x 1 nagios nagios   6375 Jul 22  2016 check_flexlm
lrwxrwxrwx 1 root   root        9 Jul 22  2016 check_ftp -> check_tcp
-rwxr-xr-x 1 nagios nagios 330600 Jul 22  2016 check_http
-r-sr-xr-x 1 root   nagios 262744 Jul 22  2016 check_icmp
-rwxr-xr-x 1 nagios nagios 199736 Jul 22  2016 check_ide_smart
-rwxr-xr-x 1 nagios nagios  15238 Jul 22  2016 check_ifoperstatus
-rwxr-xr-x 1 nagios nagios  13386 Jul 22  2016 check_ifstatus
lrwxrwxrwx 1 root   root        9 Jul 22  2016 check_imap -> check_tcp
-rwxr-xr-x 1 nagios nagios   6947 Jul 22  2016 check_ircd
-rwxr-xr-x 1 nagios nagios 213000 Jul 22  2016 check_load
-rwxr-xr-x 1 nagios nagios   6595 Jul 22  2016 check_log
-rwxr-xr-x 1 nagios nagios  22730 Jul 22  2016 check_mailq
-rwxr-xr-x 1 root   root     3420 Aug 19  2016 check_mem.sh
-rwxr-xr-x 1 nagios nagios 200672 Jul 22  2016 check_mrtg
-rwxr-xr-x 1 nagios nagios 202248 Jul 22  2016 check_mrtgtraf
-rwxr-xr-x 1 nagios nagios 213280 Jul 22  2016 check_nagios
lrwxrwxrwx 1 root   root        9 Jul 22  2016 check_nntp -> check_tcp
-rwxr-xr-x 1 root   root    22960 Sep  6  2016 check_nrpe
-rwxr-xr-x 1 nagios nagios 257096 Jul 22  2016 check_nt
-rwxr-xr-x 1 nagios nagios 259464 Jul 22  2016 check_ntp
-rwxr-xr-x 1 nagios nagios 249352 Jul 22  2016 check_ntp_peer
-rwxr-xr-x 1 nagios nagios 238664 Jul 22  2016 check_ntp_time
-rwxr-xr-x 1 nagios nagios 289144 Jul 22  2016 check_nwstat
-rwxr-xr-x 1 nagios nagios   8926 Jul 22  2016 check_oracle
-rwxr-xr-x 1 nagios nagios 224184 Jul 22  2016 check_overcr
-rwxr-xr-x 1 nagios nagios 257328 Jul 22  2016 check_ping
lrwxrwxrwx 1 root   root        9 Jul 22  2016 check_pop -> check_tcp
-rwxr-xr-x 1 nagios nagios 257352 Jul 22  2016 check_procs
-rwxr-xr-x 1 nagios nagios 220752 Jul 22  2016 check_real
-rwxr-xr-x 1 nagios nagios   9642 Jul 22  2016 check_rpc
-rwxr-xr-x 1 nagios nagios   1465 Jul 22  2016 check_sensors
-rwxr-xr-x 1 nagios nagios 252624 Jul 22  2016 check_smtp
-rwxr-xr-x 1 nagios nagios 218080 Jul 22  2016 check_ssh
-rwxr-xr-x 1 nagios nagios 195096 Jul 22  2016 check_swap
-rwxr-xr-x 1 nagios nagios 237864 Jul 22  2016 check_tcp
-rwxr-xr-x 1 nagios nagios 219208 Jul 22  2016 check_time
lrwxrwxrwx 1 root   root        9 Jul 22  2016 check_udp -> check_tcp
-rwxr-xr-x 1 nagios nagios 235624 Jul 22  2016 check_ups
-rwxr-xr-x 1 nagios nagios 186760 Jul 22  2016 check_uptime
-rwxr-xr-x 1 nagios nagios 185040 Jul 22  2016 check_users
-rwxr-xr-x 1 nagios nagios   2995 Jul 22  2016 check_wave
-rwxr-xr-x 1 nagios nagios 183256 Jul 22  2016 negate
-rwxr-xr-x 1 nagios nagios 175736 Jul 22  2016 urlize
-rwxr-xr-x 1 nagios nagios   1900 Jul 22  2016 utils.pm
-rwxr-xr-x 1 nagios nagios   2791 Jul 22  2016 utils.sh
root@tgcs017:/# su nagios
nagios@tgcs017:/$ /usr/local/nagios/libexec/check_load -w 5,10,15 -c 6,11,17
OK - load average: 0.03, 0.01, 0.00|load1=0.030;5.000;6.000;0; load5=0.010;10.000;11.000;0; load15=0.000;15.000;17.000;0;
On my CentOS Nagios Logserver:

Code:

[root@tgcs018 ~]# ls -l /usr/local/nagios/libexec
total 6920
-rwxr-xr-x. 1 root nagios 204589 Oct 22 20:03 check_apt
-rwxr-xr-x. 1 root nagios   6897 Oct 22 20:03 check_asterisk.pl
-rwxr-xr-x. 1 root nagios   1978 Oct 22 20:03 check_asterisk_sip_peers.sh
-rwxr-xr-x. 1 root nagios   2242 Oct 22 20:03 check_breeze
-rwxr-xr-x. 1 root nagios 208710 Oct 22 20:03 check_by_ssh
lrwxrwxrwx. 1 root root        9 Oct 22 20:03 check_clamd -> check_tcp
-rwxr-xr-x. 1 root nagios 153326 Oct 22 20:03 check_cluster
-rwxr-xr-x. 1 root nagios   5582 Oct 22 20:03 check_cpu_stats.sh
-rwsr-xr-x. 1 root nagios 201815 Oct 22 20:03 check_dhcp
-rwxr-xr-x. 1 root nagios 200544 Oct 22 20:03 check_dig
-rwxr-xr-x. 1 root nagios 221501 Oct 22 20:03 check_disk
-rwxr-xr-x. 1 root nagios   9289 Oct 22 20:03 check_disk_smb
-rwxr-xr-x. 1 root nagios 218578 Oct 22 20:03 check_dns
-rwxr-xr-x. 1 root nagios 117484 Oct 22 20:03 check_dummy
-rwxr-xr-x. 1 root nagios   3349 Oct 22 20:03 check_file_age
-rwxr-xr-x. 1 root nagios   6315 Oct 22 20:03 check_flexlm
lrwxrwxrwx. 1 root root        9 Oct 22 20:03 check_ftp -> check_tcp
-rwxr-xr-x. 1 root nagios 337459 Oct 22 20:03 check_http
-rwsr-xr-x. 1 root nagios 213886 Oct 22 20:03 check_icmp
-rwxr-xr-x. 1 root nagios 164240 Oct 22 20:03 check_ide_smart
-rwxr-xr-x. 1 root nagios  15123 Oct 22 20:03 check_ifoperstatus
-rwxr-xr-x. 1 root nagios  12600 Oct 22 20:03 check_ifstatus
lrwxrwxrwx. 1 root root        9 Oct 22 20:03 check_imap -> check_tcp
-rwsr-xr-x. 1 root nagios    972 Oct 22 20:03 check_init_service
-rwxr-xr-x. 1 root nagios   6887 Oct 22 20:03 check_ircd
lrwxrwxrwx. 1 root root        9 Oct 22 20:03 check_jabber -> check_tcp
-rwxr-xr-x. 1 root nagios 177925 Oct 22 20:03 check_load
-rwxr-xr-x. 1 root nagios   5981 Oct 22 20:03 check_log
-rwxr-xr-x. 1 root nagios  21480 Oct 22 20:03 check_mailq
-rwxr-xr-x. 1 root nagios 163466 Oct 22 20:03 check_mrtg
-rwxr-xr-x. 1 root nagios 163311 Oct 22 20:03 check_mrtgtraf
-rwxr-xr-x. 1 root nagios 176757 Oct 22 20:03 check_nagios
-rwxr-xr-x. 1 root nagios  25602 Oct 22 20:03 check_netstat.pl
lrwxrwxrwx. 1 root root        9 Oct 22 20:03 check_nntp -> check_tcp
lrwxrwxrwx. 1 root root        9 Oct 22 20:03 check_nntps -> check_tcp
-rwxr-xr-x. 1 root nagios  81542 Oct 22 20:03 check_nrpe
-rwxr-xr-x. 1 root nagios 209783 Oct 22 20:03 check_nt
-rwxr-xr-x. 1 root nagios 213326 Oct 22 20:03 check_ntp
-rwxr-xr-x. 1 root nagios 202383 Oct 22 20:03 check_ntp_peer
-rwxr-xr-x. 1 root nagios 197548 Oct 22 20:03 check_ntp_time
-rwxr-xr-x. 1 root nagios 240324 Oct 22 20:03 check_nwstat
-rwxr-xr-x. 1 root nagios   3259 Oct 22 20:03 check_open_files.pl
-rwxr-xr-x. 1 root nagios   8779 Oct 22 20:03 check_oracle
-rwxr-xr-x. 1 root nagios 183078 Oct 22 20:03 check_overcr
-rwxr-xr-x. 1 root nagios 213331 Oct 22 20:03 check_ping
lrwxrwxrwx. 1 root root        9 Oct 22 20:03 check_pop -> check_tcp
-rwxr-xr-x. 1 root nagios 213268 Oct 22 20:03 check_procs
-rwxr-xr-x. 1 root nagios 179948 Oct 22 20:03 check_real
-rwxr-xr-x. 1 root nagios   9581 Oct 22 20:03 check_rpc
-rwxr-xr-x. 1 root nagios   1453 Oct 22 20:03 check_sensors
-rwxr-xr-x. 1 root nagios   2174 Oct 22 20:03 check_services
lrwxrwxrwx. 1 root root        9 Oct 22 20:03 check_simap -> check_tcp
-rwxr-xr-x. 1 root nagios   7599 Oct 22 20:03 check_sip
-rwxr-xr-x. 1 root nagios 270782 Oct 22 20:03 check_smtp
lrwxrwxrwx. 1 root root        9 Oct 22 20:03 check_spop -> check_tcp
-rwxr-xr-x. 1 root nagios 179435 Oct 22 20:03 check_ssh
lrwxrwxrwx. 1 root root        9 Oct 22 20:03 check_ssmtp -> check_tcp
-rwxr-xr-x. 1 root nagios 161242 Oct 22 20:03 check_swap
-rwxr-xr-x. 1 root nagios 256129 Oct 22 20:03 check_tcp
-rwxr-xr-x. 1 root nagios 181134 Oct 22 20:03 check_time
lrwxrwxrwx. 1 root root        9 Oct 22 20:03 check_udp -> check_tcp
-rwxr-xr-x. 1 root nagios 193702 Oct 22 20:03 check_ups
-rwxr-xr-x. 1 root nagios 153948 Oct 22 20:03 check_uptime
-rwxr-xr-x. 1 root nagios 152382 Oct 22 20:03 check_users
-rwxr-xr-x. 1 root nagios   2936 Oct 22 20:03 check_wave
-rwxr-xr-x. 1 root nagios    710 Oct 22 20:03 check_yum
-rwxr-xr-x. 1 root nagios   3435 Oct 22 20:03 custom_check_mem
-rwxr-xr-x. 1 root nagios    915 Oct 22 20:03 custom_check_procs
-rwxr-xr-x. 1 root nagios   4176 Oct 22 20:03 nagisk.pl
-rwxr-xr-x. 1 root nagios 148385 Oct 22 20:03 negate
-rwxr-xr-x. 1 root nagios  73375 Oct 22 20:03 send_nsca
-rwxr-xr-x. 1 root nagios 145849 Oct 22 20:03 urlize
-rwxr-xr-x. 1 root nagios   1878 Oct 22 20:03 utils.pm
-rwxr-xr-x. 1 root nagios   2791 Oct 22 20:03 utils.sh
[root@tgcs018 ~]# su nagios
[nagios@tgcs018 root]$ /usr/local/nagios/libexec/check_load -w 5,10,15 -c 6,11,17
OK - load average: 0.02, 0.02, 0.05|load1=0.020;5.000;6.000;0; load5=0.020;10.000;11.000;0; load15=0.050;15.000;17.000;0;
Last edited by tgriep on Fri Mar 17, 2017 3:46 pm, edited 1 time in total.
Reason: Added Code Wraps around large output.
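
Both check_load runs above return OK, and the part after the "|" is performance data: one label=value;warn;crit;min entry per load average, where warn and crit echo the -w and -c thresholds given on the command line. As a minimal sketch (parse_perfdata is a hypothetical helper name, not part of Nagios), the line can be split into readable rows like this:

```shell
#!/bin/sh
# Split the perfdata part of a Nagios plugin output line into one row per metric.
# Perfdata follows the "|" and each metric looks like label=value;warn;crit;min
# parse_perfdata is an illustrative helper, not a Nagios command.
parse_perfdata() {
    # $1: full plugin output line
    printf '%s\n' "$1" |
        awk -F'|' 'NF > 1 {
            n = split($2, metrics, " ")
            for (i = 1; i <= n; i++) {
                split(metrics[i], kv, "=")   # kv[1] = label, kv[2] = value;warn;crit;min
                split(kv[2], f, ";")         # f[1] = value, f[2] = warn, f[3] = crit
                printf "%s value=%s warn=%s crit=%s\n", kv[1], f[1], f[2], f[3]
            }
        }'
}

# Output line copied from the tgcs017 run above.
line='OK - load average: 0.03, 0.01, 0.00|load1=0.030;5.000;6.000;0; load5=0.010;10.000;11.000;0; load15=0.000;15.000;17.000;0;'
parse_perfdata "$line"
```

For the tgcs017 line this prints three rows (load1, load5, load15), each far below its warning threshold, which is why the plugin reports OK when run by hand.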

Re: All Linux Server CPU Spike at same time

Post by tgriep »

Can you run the same commands on this server?
10.2.8.7 SUSE Enterprise vMA
That was the server that was generating the error from your previous post.

Re: All Linux Server CPU Spike at same time

Post by kwhogster »

Can't run that from that server locally:

tgkw002:/usr/local/nagios/libexec # ls -l
total 324
-rwxr-xr-x 1 vi-admin root 177925 Mar 17 12:13 check_load
-rwxr-xr-x 1 vi-admin root 3419 Aug 19 2016 check_mem.sh
-rwxrwxr-x 1 nagios nagios 137337 Aug 16 2016 check_nrpe
tgkw002:/usr/local/nagios/libexec # check_load -w 5,10,15 -c 6,11,17
-bash: check_load: command not found
tgkw002:/usr/local/nagios/libexec # ls
check_load check_mem.sh check_nrpe
tgkw002:/usr/local/nagios/libexec # /usr/local/nagios/libexec/check_load -w 5,10,15 -c 6,11,17
/usr/local/nagios/libexec/check_load: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by /usr/local/nagios/libexec/check_load)
tgkw002:/usr/local/nagios/libexec #


Also, the other two that were posted before have the same problem.

Remember, it is all of my Linux servers that have this issue, not just 10.2.8.7. They all do.
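
The GLIBC_2.14 error above means the check_load binary was linked against a newer glibc than the one installed on tgkw002. Its size (177925 bytes) matches the check_load on the CentOS host tgcs018, which suggests, though the thread does not confirm it, that the binary was copied over rather than compiled on the SUSE host. A rough way to compare what a binary needs with what the host provides, assuming objdump (binutils) and a sort that supports -V are available; max_glibc_version is a hypothetical helper name:

```shell
#!/bin/sh
# Print the highest GLIBC_x.y symbol version mentioned on stdin.
# max_glibc_version is an illustrative helper, not a standard tool.
max_glibc_version() {
    grep -oE 'GLIBC_[0-9]+(\.[0-9]+)+' |
        sed 's/^GLIBC_//' |
        sort -V |
        tail -n 1
}

plugin=/usr/local/nagios/libexec/check_load

# Only inspect the binary if it exists and objdump is installed.
if [ -x "$plugin" ] && command -v objdump >/dev/null 2>&1; then
    echo "highest glibc version required by $plugin:"
    objdump -T "$plugin" | max_glibc_version
    echo "glibc installed on this host:"
    ldd --version | head -n 1
fi
```

If the required version is higher than the installed one, the plugin has to be built on that host (or on a system with the same or an older glibc) instead of being copied from a newer distribution.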
dwhitfield
Former Nagios Staff
Posts: 4583
Joined: Wed Sep 21, 2016 10:29 am
Location: NoLo, Minneapolis, MN
Contact:

Re: All Linux Server CPU Spike at same time

Post by dwhitfield »

On all the servers where you can run it as root, please paste the output of this command:

for user in $(cut -f1 -d: /etc/passwd); do echo $user; crontab -u $user -l; done

Please put the outputs in individual code blocks and label each one so we can tell them apart.

Also, I am a bit confused because you say you cannot run that command, but then you run it. Do you mean you cannot run it as root?
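
The one-liner requested here can be easier to copy as a small script. This is only a sketch of the same loop: list_users and dump_crontabs are hypothetical helper names, and the "run" argument is an invented switch so that nothing happens unless asked. Note that -f1 uses the digit one (first field of /etc/passwd), and a sentence-ending period is not part of the command.

```shell
#!/bin/sh
# Print the login names (first colon-separated field) from a passwd-style file.
# Defaults to /etc/passwd when no file is given.
list_users() {
    cut -f1 -d: "${1:-/etc/passwd}"
}

# Show every user's crontab under a labelled header.
# "crontab -u" needs root; users without a crontab print a "no crontab" notice.
dump_crontabs() {
    for user in $(list_users "$@"); do
        echo "=== crontab for $user ==="
        crontab -u "$user" -l 2>&1
    done
}

# Only dump when the script is called with the (hypothetical) argument "run",
# for example: sh dump_crontabs.sh run > crontabs-$(hostname).txt
case "${1:-}" in
    run) dump_crontabs ;;
esac
```

Redirecting the output to one file per host, as in the comment, gives the separately labelled results that were asked for.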

Re: All Linux Server CPU Spike at same time

Post by kwhogster »

You're confused LOL

I am very confused by all the requests for information on this.

This is NAGIOS causing the problem.

The question here is: why do all the LINUX machines spike at the same time? That is almost impossible. This is a NAGIOS issue for sure.

Not one time did anyone ask about the configs. Oh yes, they did; I sent them all the configs.

Why am I getting so many questions, asking for this and asking for that, and not one suggestion on what to change?

Yes, I can issue the command as root.

I listed two of the four machines with results; that should be enough.


I am not a LINUX guy. A better name for me is a LINUX HACK.

This is getting beyond my pay level LOL



I ran the command on the Nagios server, no luck:

root@tgcs017:/usr/local/nagios/etc/objects# $(cut -fl -d: /etc/passwd); do echo $user; crontab -u $user -l; done.
-su: syntax error near unexpected token `do'
root@tgcs017:/usr/local/nagios/etc/objects#

Is the syntax wrong?

Re: All Linux Server CPU Spike at same time

Post by dwhitfield »

kwhogster wrote: root@tgcs017:/usr/local/nagios/etc/objects# $(cut -fl -d: /etc/passwd); do echo $user; crontab -u $user -l; done.
I suggested you run: for user in $(cut -f1 -d: /etc/passwd); do echo $user; crontab -u $user -l; done

You left off "for user in".

Also, what's the output of echo $0?

Good thing you are on the Nagios forums, since this is a Nagios issue.

Re: All Linux Server CPU Spike at same time

Post by kwhogster »

OK, thanks. Sorry for the frustration.


I did all that on all 4 servers as root.

I attached the results as separate text files; see attached.

I could only attach three, so I will post the fourth in a bit.
Attachments
TGKW002 VMA Server.txt (992 Bytes)
TGCS018 Nagios Logserver.txt (1.06 KiB)
TGCS017 Nagios Server.txt (2.25 KiB)

Re: All Linux Server CPU Spike at same time

Post by kwhogster »

Here is the fourth Linux server.
Attachments
Raspberry Pi Test Nagios Server.txt (2.02 KiB)
rkennedy
Posts: 6579
Joined: Mon Oct 05, 2015 11:45 am

Re: All Linux Server CPU Spike at same time

Post by rkennedy »

kwhogster wrote: You're confused LOL

I am very confused by all the requests for information on this.

This is NAGIOS causing the problem.

The question here is: why do all the LINUX machines spike at the same time? That is almost impossible. This is a NAGIOS issue for sure.
Are you serious right now? All of your Linux machines are NOT spiking at the same time. YOU initially configured ALL of your checks to use check_local_load.

I haven't seen in this post if the nagios user (or whoever you're running NRPE under) is able to actually run your script locally. Give that a try and get it working, then watch your /var/log/messages on the client machine when running the checks through NRPE AFTER you get it working locally.
Former Nagios Employee
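
The order suggested in this last post (get the plugin working locally as the NRPE user first, then test through NRPE while watching the client's log) could be followed with a small wrapper such as the one below. It is a sketch under assumptions: run_check and nagios_state_name are hypothetical helper names, the NRPE user is assumed to be nagios, and the NRPE command name check_load is assumed to be defined in the client's nrpe.cfg. The exit-code mapping is the standard Nagios plugin convention.

```shell
#!/bin/sh
# Map a plugin exit code to its Nagios state name.
# 0, 1 and 2 are OK, WARNING and CRITICAL; 3 and anything else is shown as UNKNOWN.
nagios_state_name() {
    case "$1" in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        *) echo UNKNOWN ;;
    esac
}

# Run any check command, show its output, then report the exit state.
run_check() {
    rc=0
    "$@" || rc=$?
    echo "exit code $rc ($(nagios_state_name "$rc"))"
}

# Step 1, on the client, in a shell owned by the NRPE user (assumed: nagios):
#   run_check /usr/local/nagios/libexec/check_load -w 5,10,15 -c 6,11,17
# Step 2, on the client in a second terminal, watch the log:
#   tail -f /var/log/messages
# Step 3, on the Nagios server (address and command name are placeholders):
#   run_check /usr/local/nagios/libexec/check_nrpe -H <client-address> -c check_load
```

A plugin that cannot start at all, as with the GLIBC error on tgkw002, typically exits with a code outside 0 to 2, so step 1 would report UNKNOWN there before NRPE is even involved.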