Nagioxi CPU consumption strongly increased

tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Re: Nagioxi CPU consumption strongly increased

Post by tgriep »

Could you PM me your System Profile so I can view it? There may be more information in it to help troubleshoot the issue.
To send us your system profile, log in to the Nagios XI GUI using a web browser.
Click the "Admin" > "System Profile" menu.
Click the "Download Profile" button.
Save the profile.zip file and PM it to me.

Also, run the following and PM the output as well.

Code:

ipcs -q
df -h
df -i
top -n1

After I receive the info, I'll let you know the next step.
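
For anyone gathering the same diagnostics, the four commands above can be captured into one file that is easy to paste into a PM. This is only a sketch: the function name `collect_diag` and the output path are made up for illustration, and `top` is run with `-b` (batch mode) here so that its output can be redirected to a file.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nagios XI): run each diagnostic command
# and append its output, preceded by a header line, to the file given as $1.
collect_diag() {
    out="$1"
    : > "$out"    # start with an empty file
    for cmd in "ipcs -q" "df -h" "df -i" "top -b -n1"; do
        printf '===== %s =====\n' "$cmd" >> "$out"
        # $cmd is intentionally unquoted so it splits into command + arguments.
        # A failing or missing command is noted in the file instead of
        # aborting the remaining commands.
        $cmd >> "$out" 2>&1 || printf '(command failed: %s)\n' "$cmd" >> "$out"
    done
}
```

Usage would look like `collect_diag /tmp/nagiosxi_diag.txt`, run as root on the Nagios XI server.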
Be sure to check out our Knowledgebase for helpful articles and solutions!
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Post by tmcdonald »

Regarding a remote session: if you would like one, we need to move this to an email ticket. To do so, please email [email protected] with a link to this thread and a brief description of the issue. Before doing this, please first respond to @tgriep's post with the information requested.

Please note, however, that I cannot guarantee a remote session at a specific time because of our timezone differences and office hours. We'll make ourselves reasonably available within those times.
Former Nagios employee
Frédéric GRANAT
Posts: 445
Joined: Mon Nov 19, 2012 11:36 am

Post by Frédéric GRANAT »

Hi,

Could you PM me your System Profile so I can view it?
=> Attached to that mail (It took 15 minutes to navigate to the system profile page)

Also, run the following and PM the output as well.
=> Here it is:

Code:

[root@nagiosxi mail]# ipcs -q

------ Message Queues --------
key        msqid      owner      perms      used-bytes   messages
0xeb000002 327680     nagios     600        0            0

[root@nagiosxi mail]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
                       28G   24G  2.6G  91% /
/dev/sda1              99M   31M   63M  34% /boot
tmpfs                 1.5G     0  1.5G   0% /dev/shm
[root@nagiosxi mail]# df -i
Filesystem            Inodes   IUsed   IFree IUse% Mounted on
/dev/mapper/VolGroup00-LogVol00
                     7569408  648392 6921016    9% /
/dev/sda1              26104      53   26051    1% /boot
tmpfs                 219735       1  219734    1% /dev/shm
[root@nagiosxi mail]# top -n1
top - 17:10:06 up 51 days,  8:22,  1 user,  load average: 23.20, 28.64, 29.76
Tasks: 255 total,  13 running, 240 sleeping,   0 stopped,   2 zombie
Cpu(s): 19.5%us,  6.5%sy,  0.1%ni, 73.8%id,  0.1%wa,  0.0%hi,  0.1%si,  0.0%st
Mem:   3107100k total,  2547180k used,   559920k free,   207240k buffers
Swap:  1048568k total,    36468k used,  1012100k free,  1388920k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
11693 root      25   0 20776  13m 2388 S 59.1  0.5   0:00.96 mrtg
15560 nagios    25   0  3120  892  592 R 33.5  0.0 760:56.29 nagios
21104 apache    16   0 66964  25m 5412 S 31.5  0.9   2:30.16 httpd
15556 nagios    25   0  3120  904  592 R 21.7  0.0 778:49.51 nagios
19033 postgres  16   0 23012  11m  10m S 17.7  0.4   2:45.24 postmaster
 2247 apache    21   0 67252  25m 5316 R 13.8  0.9   0:04.62 httpd
13831 postgres  16   0 23160  12m  10m R 13.8  0.4   3:18.26 postmaster
21345 postgres  15   0 23160  12m  10m S 13.8  0.4   1:10.12 postmaster
21323 postgres  16   0 21992  11m  10m R 11.8  0.4  20:38.92 postmaster
 9591 postgres  16   0 23160  12m  10m R  9.9  0.4   0:07.00 postmaster
11743 nagios    17   0  8164 4076 1720 R  9.9  0.1   0:00.05 check_wmi_plus.
 2393 postgres  15   0 23160  12m  10m R  7.9  0.4   0:35.15 postmaster
 2339 postgres  15   0 21992  11m  10m S  5.9  0.4   0:36.23 postmaster
 9110 apache    15   0 67252  26m 5400 S  2.0  0.9   0:25.10 httpd
 9236 postgres  15   0 21992  11m  10m S  2.0  0.4   3:37.19 postmaster
15564 nagios    15   0 20472 2528 1088 S  2.0  0.1 102:04.57 ndo2db
    1 root      15   0  2168  668  576 S  0.0  0.0   0:05.97 init
[root@nagiosxi mail]#
Last edited by tgriep on Thu Sep 08, 2016 10:40 am, edited 1 time in total.
Reason: Removed the profile

Post by Frédéric GRANAT »

I forgot to send you the profile file by PM.
Please delete it from the post once downloaded.

Post by tgriep »

It looks like the Postgres database is consuming most of the server's resources, and it may need to be vacuumed and restarted to fix this.
The processes for vacuuming the PostgreSQL database are described about halfway down this KB article:
https://support.nagios.com/kb/article.php?id=25

Depending on the version of Postgres, there are a few ways to run the vacuum. If the first one fails, try the second method, then the third.

It looks like there are a lot of leftover lock files for the MRTG process that gathers the bandwidth information for switches and routers.
Run the following to remove them, as they are not needed.
rm -fr /etc/mrtg/mrtg.cfg_l_*

Also, the performance problem kept the performance graphs from running, and the leftover files need to be deleted as well.
Run the following to do that.

Code:

for f in /usr/local/nagios/var/spool/xidpe/*; do rm -f "$f"; done

After all of the steps have been done, please reboot the server.
Let us know if this improves the performance of the server.
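
Before running the cleanup loop, it may be worth checking how many files have piled up in the spool directory. A minimal sketch, assuming a `find` that supports `-maxdepth` (GNU and BSD versions do); the function name `count_spool_files` is invented for this example:

```shell
#!/bin/sh
# Hypothetical helper (not part of Nagios XI): print the number of regular
# files directly inside the directory given as $1, without recursing.
count_spool_files() {
    # One line is printed per file and wc counts the lines, so a file name
    # containing a newline would be counted more than once.
    find "$1" -maxdepth 1 -type f | wc -l | tr -d ' '
}
```

For example, `count_spool_files /usr/local/nagios/var/spool/xidpe` prints the current count, and running it again after the cleanup loop should print 0 or a small number.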

Post by Frédéric GRANAT »

I ran:

Code:

service postgresql stop
su postgres
echo "VACUUM FULL;" > /tmp/fix.sql
postgres -D /var/lib/pgsql/data nagiosxi < /tmp/fix.sql
postgres -D /var/lib/pgsql/data postgres < /tmp/fix.sql
postgres -D /var/lib/pgsql/data template1 < /tmp/fix.sql
postgres -D /var/lib/pgsql/data nagiosfusion < /tmp/fix.sql
exit
service postgresql start
=> No improvement
rm -fr /etc/mrtg/mrtg.cfg_l_*
=> No improvement
for f in /usr/local/nagios/var/spool/xidpe/*; do rm -f $f; done
=> No improvement
please reboot the server.
=> No improvement
Last edited by Frédéric GRANAT on Fri Sep 09, 2016 9:24 am, edited 1 time in total.

Post by tgriep »

Do you want to continue to troubleshoot the problem here in the forum or in the ticket you opened?

Can you run the following and post the output here?

Code:

top -n1

Then, can you post the following files so I can view them?

Code:

/var/lib/pgsql/data/pg_log/postgresql-Fri.log
/etc/php.ini

Post by Frédéric GRANAT »

Do you want to continue to troubleshoot the problem here in the forum or in the ticket you opened?
=> Since I have had no news on my ticket, it's OK to continue here for the moment.

Can you run the following and post the output here?

Code:

[root@nagiosxi ~]# top -n1
top - 16:25:19 up  1:44,  1 user,  load average: 3.73, 5.03, 6.11
Tasks: 208 total,   7 running, 197 sleeping,   0 stopped,   4 zombie
Cpu(s): 31.3%us, 17.1%sy,  0.0%ni, 51.1%id,  0.2%wa,  0.0%hi,  0.1%si,  0.0%st
Mem:   3107100k total,   965608k used,  2141492k free,   112392k buffers
Swap:  1048568k total,        0k used,  1048568k free,   437556k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 4422 nagios    25   0  3080  828  592 R 63.5  0.0  10:57.37 nagios
 4425 nagios    25   0  3124  892  592 R 63.5  0.0  18:20.92 nagios
 4421 nagios    25   0  3120  900  592 R 36.6  0.0  15:59.77 nagios
 4426 nagios    25   0  3132  856  592 R 32.7  0.0  13:16.35 nagios
 4423 nagios    25   0  3120  856  592 R 28.9  0.0  12:40.79 nagios
 4424 nagios    25   0  3080  836  592 R 26.9  0.0  14:47.79 nagios
    1 root      15   0  2168  668  576 S  0.0  0.0   0:01.29 init
    2 root      RT  -5     0    0    0 S  0.0  0.0   0:00.05 migration/0
    3 root      35  19     0    0    0 S  0.0  0.0   0:00.00 ksoftirqd/0
    4 root      RT  -5     0    0    0 S  0.0  0.0   0:00.08 migration/1
    5 root      34  19     0    0    0 S  0.0  0.0   0:00.00 ksoftirqd/1
    6 root      RT  -5     0    0    0 S  0.0  0.0   0:00.05 migration/2
    7 root      34  19     0    0    0 S  0.0  0.0   0:00.00 ksoftirqd/2
    8 root      RT  -5     0    0    0 S  0.0  0.0   0:00.09 migration/3
    9 root      34  19     0    0    0 S  0.0  0.0   0:00.00 ksoftirqd/3
   10 root      10  -5     0    0    0 S  0.0  0.0   0:00.03 events/0
   11 root      10  -5     0    0    0 S  0.0  0.0   0:00.01 events/1
[root@nagiosxi ~]#

I attached the two files

Post by tgriep »

I did reply to the ticket yesterday around 11:11am CST. Strange that you didn't get it.

I see a max connection error in the Postgres log file. To fix that, edit this file:

Code:

/var/lib/pgsql/data/postgresql.conf

Find this line and increase the value. Try doubling it.

Code:

max_connections = 100

Then restart the Postgres database by running:

Code:

service postgresql restart

If you do find the email from yesterday, could you reply to it and attach a new system profile?
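
The edit to postgresql.conf can also be scripted. The following is a rough sketch rather than an official Nagios procedure: the function name `double_max_connections` is hypothetical, and it assumes the file contains an uncommented `max_connections = N` line like the one shown above. It keeps a `.bak` copy of the original file, and the change only takes effect after the restart.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nagios XI or PostgreSQL): double the
# active max_connections value in the file given as $1, keeping a .bak copy.
double_max_connections() {
    conf="$1"
    cp "$conf" "$conf.bak" || return 1
    # Only an uncommented "max_connections = N" line is rewritten; every
    # other line, and any trailing comment on that line, is left unchanged.
    awk '
        /^[ \t]*max_connections[ \t]*=[ \t]*[0-9]+/ {
            match($0, /[0-9]+/)                      # locate the current value
            n = substr($0, RSTART, RLENGTH)
            $0 = substr($0, 1, RSTART - 1) (n * 2) substr($0, RSTART + RLENGTH)
        }
        { print }
    ' "$conf.bak" > "$conf"
}
```

It would be called as `double_max_connections /var/lib/pgsql/data/postgresql.conf`, followed by the `service postgresql restart` command above.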

Post by Frédéric GRANAT »

Hi,
I increased max_connections, but there was no improvement.

Frederic