
Re: Current Load on Localhost

Posted: Mon Oct 18, 2021 2:54 am
by lanxessinfy
Hi,

I did as you suggested and ran the passive check for the Current Load service.

Here is the output of cat /usr/local/nagios/var/nagios.log | grep -Ei 'current load' -A 5 -B 1:

[1634542781] SERVICE NOTIFICATION: nagiosadmin;localhost;Current Load;CRITICAL;xi_service_notification_handler;test
[1634542781] SERVICE NOTIFICATION: LVNYI;localhost;Current Load;CRITICAL;notify_service_xi_contact_copy_1;test
[1634542781] SERVICE ALERT: localhost;Current Load;CRITICAL;HARD;1;test
[1634542781] GLOBAL SERVICE EVENT HANDLER: localhost;Current Load;CRITICAL;HARD;1;xi_service_event_handler
[1634542781] SERVICE EVENT HANDLER: localhost;Current Load;CRITICAL;HARD;1;Service Restart - Linux
[1634542782] wproc: SERVICE EVENTHANDLER job 169 from worker Core Worker 959 is a non-check helper but exited with return code 1
[1634542782] wproc: early_timeout=0; exited_ok=1; wait_status=256; error_code=0;
[1634542782] wproc: stderr line 01: Failed to restart httpd.service: Interactive authentication required.
[1634542782] wproc: stderr line 02: See system logs and 'systemctl status httpd.service' for details.
[1634542788] wproc: Core Worker 958: job 58 (pid=2073) timed out. Killing it
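
A note on reading the wproc lines above: wait_status=256 looks like a raw wait(2)-style status, in which the exit code sits in the high byte, so 256 corresponds to the "return code 1" reported for the event handler. The helper below is only a sketch of that arithmetic; the function name decode_wait_status is made up for illustration, and it ignores the stopped-process and core-dump cases.

```shell
#!/bin/bash
# decode_wait_status is a hypothetical helper, not part of Nagios.
# It splits a raw wait(2)-style status into "exit code" or "signal".
decode_wait_status() {
    local status="$1"
    # The low 7 bits hold the terminating signal (0 means a normal exit).
    local signal=$(( status & 0x7F ))
    if [ "$signal" -eq 0 ]; then
        # The exit code is stored in the next 8 bits.
        echo "exited with code $(( (status >> 8) & 0xFF ))"
    else
        echo "killed by signal $signal"
    fi
}

decode_wait_status 256   # prints "exited with code 1"
```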


Log info from /tmp/hostinfo.txt

[root@LAZURENGMO1VP tmp]# cat hostinfo.txt
top - 09:39:42 up 5 days, 4:32, 1 user, load average: 0.54, 0.37, 0.39
Tasks: 289 total, 11 running, 276 sleeping, 0 stopped, 2 zombie
%Cpu(s): 36.5 us, 19.0 sy, 0.0 ni, 44.4 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 32946500 total, 10986128 free, 2398156 used, 19562216 buff/cache
KiB Swap: 2097148 total, 2097148 free, 0 used. 29636468 avail Mem

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
3635 nagios 20 0 443304 30140 11316 S 60.0 0.1 0:00.09 php
3625 root 20 0 0 0 0 Z 26.7 0.0 0:00.25 falcon-fx
17387 root 20 0 0 0 0 D 13.3 0.0 0:00.58 kworker/u8+
1066 root 20 0 495872 21360 17444 S 6.7 0.1 1:15.30 rsyslogd
1568 mysql 20 0 2645404 898960 10588 S 6.7 2.7 376:23.58 mysqld
3410 root 20 0 87028 22096 5252 R 6.7 0.1 0:00.37 rpm
3736 root 10 -10 84948 24824 3780 S 6.7 0.1 26:31.79 microsoft-+
4470 root 20 0 0 0 0 R 6.7 0.0 26:38.23 kcs-evbsyn+
4482 root 20 0 0 0 0 R 6.7 0.0 82:44.40 kcs-term
4483 root 20 0 0 0 0 S 6.7 0.0 79:25.91 kcs-created
1 root 20 0 191268 4344 2652 S 0.0 0.0 5:41.05 systemd
2 root 20 0 0 0 0 S 0.0 0.0 0:00.27 kthreadd
4 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/0:+
Mon Oct 18 09:39:42 CEST 2021


Please help us.

Re: Current Load on Localhost

Posted: Mon Oct 18, 2021 11:40 am
by pbroste
Hello @lanxessinfy

Thanks for following up with the details. The event handler is firing and attempting the restart, but as the log shows it is getting hung up on permissions. During additional testing on my test VM we added the following to '/etc/sudoers':

Code:

nagios ALL = NOPASSWD: /bin/systemctl restart httpd
Then we added sudo to the restart command in your httpd restart script, which is owned by the nagios user and group:
sudo systemctl restart httpd.service
Thanks,
Perry

Re: Current Load on Localhost

Posted: Tue Oct 19, 2021 2:55 am
by lanxessinfy
Hi,

Please find the logs:
cat /usr/local/nagios/var/nagios.log | grep -Ei 'current load' -A 5 -B 1


[1634629813] GLOBAL SERVICE EVENT HANDLER: USTHXNAPP02;CPU Usage;CRITICAL;SOFT;1;xi_service_event_handler
[1634629866] SERVICE NOTIFICATION: nagiosadmin;localhost;Current Load;CRITICAL;xi_service_notification_handler;Test
[1634629866] SERVICE NOTIFICATION: LVNYI;localhost;Current Load;CRITICAL;notify_service_xi_contact_copy_1;Test
[1634629866] SERVICE ALERT: localhost;Current Load;CRITICAL;HARD;1;Test
[1634629866] GLOBAL SERVICE EVENT HANDLER: localhost;Current Load;CRITICAL;HARD;1;xi_service_event_handler
[1634629866] SERVICE EVENT HANDLER: localhost;Current Load;CRITICAL;HARD;1;Service Restart - Linux
[1634629868] wproc: SERVICE EVENTHANDLER job 130640 from worker Core Worker 24974 is a non-check helper but exited with return code 1
[1634629868] wproc: early_timeout=0; exited_ok=1; wait_status=256; error_code=0;
[1634629868] wproc: stderr line 01:
[1634629868] wproc: stderr line 02: We trust you have received the usual lecture from the local System
[1634629868] wproc: stderr line 03: Administrator. It usually boils down to these three things:
[root@LAZURENGMO1VP nagios]# grep -Ei 'Starting The Apache HTTP Server' /var/log/messages
[root@LAZURENGMO1VP nagios]#


I think the server was not restarted.

Please provide us with a solution.

Thanks!

Re: Current Load on Localhost

Posted: Tue Oct 19, 2021 1:23 pm
by pbroste
Hello @lanxessinfy

Previously:
pbroste wrote: Hello @lanxessinfy

Thanks for following up with the details. The event handler is firing and attempting the restart, but as the log shows it is getting hung up on permissions. During additional testing on my test VM we added the following to '/etc/sudoers':

Code:

nagios ALL = NOPASSWD: /bin/systemctl restart httpd
Then we added sudo to the restart command in your httpd restart script, which is owned by the nagios user and group:
sudo systemctl restart httpd.service
Thanks,
Perry

The entry for the user nagios in '/etc/sudoers' was verified to work on my test VM. However, I did not specify the alias, so please update your '/etc/sudoers' to use the alias by adding the following line:

Code:

NAGIOSXI ALL = NOPASSWD: /bin/systemctl restart httpd.service
Please verify that the 'service_restart.sh' script works by entering the following command(s):

Code:

su -l nagios
The prompt should now look like:
[nagios@localhost ~]$
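
As an optional check while logged in as nagios, sudo itself can report whether the restart command is permitted without a password. This is a sketch, not part of the Nagios XI tooling: the function name can_restart and the SUDO variable are made up for illustration. It relies on 'sudo -n' (never prompt) and 'sudo -l COMMAND' (exit non-zero when the policy does not allow COMMAND). Note that sudoers matches command arguments literally, so a rule written for 'restart httpd' should not be expected to cover 'restart httpd.service'.

```shell
#!/bin/bash
# SUDO is a hypothetical override so the check can be dry-run; it defaults to sudo.
SUDO="${SUDO:-sudo}"

# can_restart UNIT: succeed only if the current user may run
# "/bin/systemctl restart UNIT" through sudo without being prompted.
can_restart() {
    "$SUDO" -n -l /bin/systemctl restart "$1" > /dev/null 2>&1
}

if can_restart httpd.service; then
    echo "sudo rule found: passwordless restart of httpd.service is allowed"
else
    echo "no matching passwordless sudo rule for httpd.service"
fi
```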
Run the 'service_restart.sh' script (in my VM test environment I moved service_restart.sh to '/home/nagios' and made sure the permissions were correct):

Code:

bash -x service_restart.sh CRITICAL
The output should look like:
bash -x service_restart.sh CRITICAL
+ SERVICESTATE=CRITICAL
+ case "$SERVICESTATE" in
+ top -b -n 1
+ head -20
+ date
+ sudo systemctl restart httpd.service
We also need to make sure that '/tmp/hostinfo.txt' is owned by the nagios user and group:

Code:

chown nagios:nagios /tmp/hostinfo.txt
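
For readers who do not have the script at hand, the trace above suggests roughly the following shape. This is a reconstruction from the 'bash -x' output only, not the actual service_restart.sh: the function names, the HOSTINFO variable, and the redirect to '/tmp/hostinfo.txt' are assumptions (the trace does not show redirections).

```shell
#!/bin/bash
# Hypothetical sketch of service_restart.sh, pieced together from the trace.
# Assumed: the snapshot is written to /tmp/hostinfo.txt (overridable here).
HOSTINFO="${HOSTINFO:-/tmp/hostinfo.txt}"

# Kept in its own function so it can be swapped out for a dry run.
restart_service() {
    sudo systemctl restart httpd.service
}

handle_state() {
    local SERVICESTATE="$1"
    case "$SERVICESTATE" in
        CRITICAL)
            # Save a snapshot of the busiest processes plus a timestamp,
            # then restart Apache through the passwordless sudo rule.
            { top -b -n 1 | head -20; date; } > "$HOSTINFO"
            restart_service
            ;;
        *)
            # Any other state: nothing to do.
            ;;
    esac
}

# The first argument is the service state passed in by the event handler.
handle_state "${1:-}"
```

Because the snapshot is written by whichever user runs the script, a '/tmp/hostinfo.txt' left behind by an earlier run as root would not be writable by nagios, which is why the ownership change matters.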
Thanks,
Perry

Re: Current Load on Localhost

Posted: Thu Oct 21, 2021 9:30 am
by lanxessinfy
Hi,

Thank you so much.

We are now able to do it.

Any idea what the kcs-evbsync/0, kcs-evbsync/1, kcs-evbsync/2 and kcs-evbsync/3 processes are?
They are using a lot of CPU.

Thanks again!

Re: Current Load on Localhost

Posted: Thu Oct 21, 2021 1:23 pm
by pbroste
Hello @lanxessinfy

Excellent, glad to hear that things are dialed in.

The 'kcs-<..xxx..>' processes belong to "cslookaside", a third-party kernel module that can be found on Red Hat distros. It is a CrowdStrike security kernel module. If it causes problems you can disable it.

Thanks,
Perry