Re: Nagiosxi does not show availability report
Posted: Wed Jan 13, 2016 2:54 pm
by jonathan.cruz
Now all my components are down.
[root@Myserver~]# running, 1612 sleeping, 0 stopped, 0 zombie
Cpu(s): 23.3%us, 4.4%sy, 0.0%ni, 70.2%id, 1.9%wa, 0.0%hi, 0.2%si, 0.0%st
Mem: 16326596k total, 10013100k used, 6313496k free, 158216k buffers
Swap: 2064380k total, 23052k used, 2041328k free, 7104536k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
59580 nagios 20 0 235m 69m 4688 R 98.8 0.4 0:06.66 check_esx3.pl
59719 nagios 20 0 252m 86m 4688 R 98.8 0.5 0:09.53 check_esx3.pl
62002 root 20 0 16236 2392 884 R 10.6 0.0 0:00.11 top
62011 nagios 20 0 110m 2972 1688 S 5.3 0.0 0:00.03 snmpwalk
62017 nagios 20 0 110m 2968 1688 S 5.3 0.0 0:00.03 snmpwalk
62023 nagios 20 0 110m 2968 1688 S 5.3 0.0 0:00.03 snmpwalk
62027 nagios 20 0 110m 2972 1688 S 5.3 0.0 0:00.03 snmpwalk
62030 nagios 20 0 110m 2968 1688 S 5.3 0.0 0:00.03 snmpwalk
62033 nagios 20 0 110m 2964 1688 S 5.3 0.0 0:00.03 snmpwalk
62056 nagios 20 0 110m 2972 1688 S 5.3 0.0 0:00.03 snmpwalk
62074 nagios 20 0 110m 2964 1688 S 5.3 0.0 0:00.03 snmpwalk
62094 nagios 20 0 110m 2972 1688 S 5.3 0.0 0:00.03 snmpwalk
62097 nagios 20 0 110m 2972 1688 S 5.3 0.0 0:00.03 snmpwalk
62103 nagios 20 0 110m 2968 1688 S 5.3 0.0 0:00.03 snmpwalk
62107 nagios 20 0 110m 2968 1688 S 5.3 0.0 0:00.03 snmpwalk
62111 nagios 20 0 110m 2972 1688 S 5.3 0.0 0:00.03 snmpwalk
62114 nagios 20 0 110m 2964 1688 S 5.3 0.0 0:00.03 snmpwalk
62121 nagios 20 0 110m 2964 1688 S 5.3 0.0 0:00.03 snmpwalk
Re: Nagiosxi does not show availability report
Posted: Wed Jan 13, 2016 2:57 pm
by jonathan.cruz
Jan 13 17:56:01 MYSERVER crond[19632]: (nagios) ERROR (failed to change user)
Jan 13 17:56:01 MYSERVER crond[19634]: (CRON) ERROR (setreuid failed): Resource temporarily unavailable
Jan 13 17:56:01 MYSERVER crond[19634]: (nagios) ERROR (failed to change user)
Jan 13 17:56:01 MYSERVER crond[19635]: (CRON) ERROR (setreuid failed): Resource temporarily unavailable
Jan 13 17:56:01 MYSERVER crond[19635]: (nagios) ERROR (failed to change user)
Jan 13 17:56:01 MYSERVER crond[19633]: (CRON) ERROR (setreuid failed): Resource temporarily unavailable
Jan 13 17:56:01 MYSERVER crond[19633]: (nagios) ERROR (failed to change user)
Jan 13 17:56:01 MYSERVER crond[19637]: (CRON) ERROR (setreuid failed): Resource temporarily unavailable
Jan 13 17:56:01 MYSERVER crond[19637]: (nagios) ERROR (failed to change user)
Jan 13 17:56:01 MYSERVER crond[19638]: (CRON) ERROR (setreuid failed): Resource temporarily unavailable
Jan 13 17:56:01 MYSERVER crond[19638]: (nagios) ERROR (failed to change user)
Jan 13 17:56:01 MYSERVER crond[19636]: (CRON) ERROR (setreuid failed): Resource temporarily unavailable
Jan 13 17:56:01 MYSERVER crond[19636]: (nagios) ERROR (failed to change user)
Jan 13 17:56:01 MYSERVER crond[19639]: (CRON) ERROR (setreuid failed): Resource temporarily unavailable
Jan 13 17:56:01 MYSERVER crond[19639]: (nagios) ERROR (failed to change user)
Re: Nagiosxi does not show availability report
Posted: Wed Jan 13, 2016 3:18 pm
by jolson
Let's check on your security limits; we may need to raise them:
Is cron running anymore?
Check out this post in the CentOS forum:
https://www.centos.org/forums/viewtopic.php?t=27016
"This is commonly caused by running out of file descriptors.
There are different file descriptor limits.
There is the systems total file descriptor limit, what do you get from the command:
From this command you will get 3 numbers. The first is the number of used file descriptors, the second is the number of allocated but unused file descriptors, and the last is the system maximum. If either of the first two numbers is near or at the third, you need to increase the number of file descriptors for the system or find out what is consuming them.
If the total of the used system file descriptors is not near the max it may be a user limit.
To find out what a user's file descriptor limit is, run the commands:
Replace UID with the user ID of the user you want to check, or if you are already logged in as that user just run the ulimit command.
To find out how many file descriptors are in use by a user, run the command:
Code: Select all
sudo lsof -u nagios 2>/dev/null | wc -l
So now if you are having a system file descriptor limit issue, you will need to edit your /etc/sysctl.conf file and add, or modify if it already exists, a line with fs.file-max, set it to a value large enough to deal with the number of file descriptors you need, and reboot.
The line would look something like:
If it is an individual user's file descriptor limit, then you will have to update the user's limits in the /etc/security/limits.conf file with an entry like:
Code: Select all
UID soft nofile 4096
UID hard nofile 10240
Once again you will have to replace UID with the user ID of the account with the issue."
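The commands did not survive in the quoted post above. As a rough sketch of the system-wide check it describes: on Linux, `/proc/sys/fs/file-nr` reports three numbers of that kind (allocated, allocated but unused, maximum). The helper name `fd_percent_used` is made up for this example, and this is not necessarily the command the original post used.

```shell
# Sketch only: assumes Linux with /proc mounted.

# fd_percent_used "ALLOCATED UNUSED MAX" -> integer percent of MAX in use
fd_percent_used() {
    set -- $1    # split the one-line argument into three fields
    echo $(( ($1 - $2) * 100 / $3 ))
}

if [ -r /proc/sys/fs/file-nr ]; then
    line=$(cat /proc/sys/fs/file-nr)
    echo "system file handles: $line ($(fd_percent_used "$line")% of max in use)"
fi

# Open-file limits of the current shell; run as the user you want to check.
echo "soft nofile: $(ulimit -S -n)  hard nofile: $(ulimit -H -n)"
```

If the percentage is low, the system-wide limit is probably not the problem and the per-user limits are the next thing to look at.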
Please let us know what you find out about the above. It's also alarming that check_esx3 is taking up so much CPU - how long do those processes stay running for?
Also, how many snmpwalks are running on the server?
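One possible way to answer both questions from a single `ps` listing is sketched below. It assumes a procps-style `ps` that accepts `-o etime=,comm=`; the helper `count_cmd` is a name invented for this example.

```shell
# count_cmd NAME: count stdin lines whose last field equals NAME
count_cmd() {
    awk -v name="$1" '$NF == name { n++ } END { print n + 0 }'
}

# Elapsed time and command name for every process (no header line).
listing=$(ps -e -o etime=,comm= 2>/dev/null)

echo "snmpwalk processes: $(printf '%s\n' "$listing" | count_cmd snmpwalk)"

# Show how long each check_esx3.pl has been running (first column is elapsed time).
printf '%s\n' "$listing" | awk '$NF == "check_esx3.pl"'
```

Running it a few times a minute apart gives a feel for whether the check_esx3.pl processes finish or pile up.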
Re: Nagiosxi does not show availability report
Posted: Wed Jan 13, 2016 8:37 pm
by jonathan.cruz
I have run all the tests and identified that my environment is hitting the max user processes limit:
ulimit -Sa
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 127383
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 809993
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 2048
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
But after raising this limit I keep receiving the error "Availability data is not available when monitoring engine is not running." and I keep getting errors in the cron jobs:
Jan 13 23:35:01 MYSERVER CROND[20384]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php > /usr/local/nagiosxi/var/perfdataproc.log 2>&1)
Jan 13 23:35:01 MYSERVER CROND[20386]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/eventman.php > /usr/local/nagiosxi/var/eventman.log 2>&1)
Jan 13 23:35:01 MYSERVER CROND[20388]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/reportengine.php > /usr/local/nagiosxi/var/reportengine.log 2>&1)
Jan 13 23:35:01 MYSERVER CROND[20387]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/cmdsubsys.php > /usr/local/nagiosxi/var/cmdsubsys.log 2>&1)
Jan 13 23:35:01 MYSERVER CROND[20382]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/deadpool.php > /usr/local/nagiosxi/var/deadpool.log 2>&1)
Jan 13 23:35:01 MYSERVER CROND[20383]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php > /usr/local/nagiosxi/var/sysstat.log 2>&1)
Jan 13 23:36:01 MYSERVER crond[34550]: (CRON) ERROR (setreuid failed): Resource temporarily unavailable
Jan 13 23:36:01 MYSERVER crond[34550]: (nagios) ERROR (failed to change user)
Jan 13 23:36:01 MYSERVER crond[34551]: (CRON) ERROR (setreuid failed): Resource temporarily unavailable
Jan 13 23:36:01 MYSERVER crond[34551]: (nagios) ERROR (failed to change user)
Jan 13 23:36:01 MYSERVER crond[34555]: (CRON) ERROR (setreuid failed): Resource temporarily unavailable
Jan 13 23:36:01 MYSERVER crond[34553]: (CRON) ERROR (setreuid failed): Resource temporarily unavailable
Jan 13 23:36:01 MYSERVER crond[34548]: (CRON) ERROR (setreuid failed): Resource temporarily unavailable
Jan 13 23:36:01 MYSERVER crond[34549]: (CRON) ERROR (setreuid failed): Resource temporarily unavailable
Jan 13 23:36:01 MYSERVER crond[34549]: (nagios) ERROR (failed to change user)
Jan 13 23:36:01 MYSERVER crond[34553]: (nagios) ERROR (failed to change user)
Jan 13 23:36:01 MYSERVER crond[34548]: (nagios) ERROR (failed to change user)
Jan 13 23:36:01 MYSERVER crond[34555]: (nagios) ERROR (failed to change user)
Jan 13 23:36:01 MYSERVER crond[34554]: (CRON) ERROR (setreuid failed): Resource temporarily unavailable
Jan 13 23:36:01 MYSERVER crond[34552]: (CRON) ERROR (setreuid failed): Resource temporarily unavailable
Jan 13 23:36:01 MYSERVER crond[34554]: (nagios) ERROR (failed to change user)
Jan 13 23:36:01 MYSERVER crond[34552]: (nagios) ERROR (failed to change user)
Jan 13 23:37:01 MYSERVER crond[52039]: (CRON) ERROR (setreuid failed): Resource temporarily unavailable
Jan 13 23:37:01 MYSERVER crond[52039]: (nagios) ERROR (failed to change user)
Jan 13 23:37:01 MYSERVER crond[52042]: (CRON) ERROR (setreuid failed): Resource temporarily unavailable
Jan 13 23:37:01 MYSERVER crond[52043]: (CRON) ERROR (setreuid failed): Resource temporarily unavailable
Jan 13 23:37:01 MYSERVER crond[52042]: (nagios) ERROR (failed to change user)
Jan 13 23:37:01 MYSERVER crond[52038]: (CRON) ERROR (setreuid failed): Resource temporarily unavailable
Jan 13 23:37:01 MYSERVER crond[52043]: (nagios) ERROR (failed to change user)
Jan 13 23:37:01 MYSERVER crond[52044]: (CRON) ERROR (setreuid failed): Resource temporarily unavailable
Jan 13 23:37:01 MYSERVER crond[52041]: (CRON) ERROR (setreuid failed): Resource temporarily unavailable
Jan 13 23:37:01 MYSERVER crond[52038]: (nagios) ERROR (failed to change user)
Jan 13 23:37:01 MYSERVER crond[52044]: (nagios) ERROR (failed to change user)
Jan 13 23:37:01 MYSERVER crond[52041]: (nagios) ERROR (failed to change user)
Jan 13 23:37:01 MYSERVER crond[52040]: (CRON) ERROR (setreuid failed): Resource temporarily unavailable
Jan 13 23:37:01 MYSERVER crond[52045]: (CRON) ERROR (setreuid failed): Resource temporarily unavailable
Jan 13 23:37:01 MYSERVER crond[52045]: (nagios) ERROR (failed to change user)
Jan 13 23:37:01 MYSERVER crond[52040]: (nagios) ERROR (failed to change user)
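On Linux, `setreuid` can fail with "Resource temporarily unavailable" (EAGAIN) when switching to a user that is already at its max user processes limit, which fits the 2048 shown by `ulimit` above. A minimal sketch for comparing a user's task count with that limit follows; it assumes Linux with /proc and a procps-style `ps`, and `percent_of` is a helper name invented here.

```shell
# percent_of USED LIMIT -> integer percentage of LIMIT in use
percent_of() {
    echo $(( $1 * 100 / $2 ))
}

user=nagios    # account from this thread

# One line per thread (-L), because threads count toward the limit as well.
used=$(ps -L -u "$user" -o pid= 2>/dev/null | grep -c .)

# Soft limit of the current shell; run this as $user to see that user's limit.
limit=$(awk '/^Max processes/ {print $3}' /proc/self/limits 2>/dev/null)

case "$limit" in
    ''|0|*[!0-9]*) echo "$user: $used tasks, soft nproc limit: ${limit:-unknown}" ;;
    *)             echo "$user: $used tasks, $(percent_of "$used" "$limit")% of soft nproc limit $limit" ;;
esac
```

If the figure is close to 100%, the limit can be raised with `nproc` entries in /etc/security/limits.conf, in the same form as the `nofile` entries quoted earlier. On some CentOS releases a file under /etc/security/limits.d/ may also set nproc, so it is worth checking there too.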
Re: Nagiosxi does not show availability report
Posted: Wed Jan 13, 2016 9:31 pm
by jonathan.cruz
After increasing the max user processes limit, the cron jobs for the monitoring engine run normally, but I am still receiving the message "Availability data is not available when monitoring engine is not running."
# ls -ltr /usr/local/nagiosxi/var
-rw-r--r-- 1 nagios nagios 30110 Jan 14 00:01 recurringdowntime.log
drwxr-xr-x 2 nagios nagios 4096 Jan 14 00:11 subsys
-rw-r--r-- 1 nagios nagios 75 Jan 14 00:25 deadpool.log
-rw-r--r-- 1 nagios nagios 10606 Jan 14 00:27 dbmaint.log
-rw-r--r-- 1 nagios nagios 0 Jan 14 00:29 reportengine.log
-rw-r--r-- 1 nagios nagios 0 Jan 14 00:29 nom.log
-rw-r--r-- 1 nagios nagios 303 Jan 14 00:29 cleaner.log
-rw-r--r-- 1 nagios nagios 10138 Jan 14 00:29 corelog.diff
-rw-r--r-- 1 nagios nagios 7 Jan 14 00:29 corelog.data
-rw-r--r-- 1 nagios nagios 81 Jan 14 00:29 perfdataproc.log
-rw-r--r-- 1 nagios nagios 25 Jan 14 00:29 feedproc.log
-rw-r--r-- 1 nagios nagios 8222 Jan 14 00:29 sysstat.log
-rw-r--r-- 1 nagios nagios 44044 Jan 14 00:29 eventman.log
-rw-r--r-- 1 nagios nagios 80 Jan 14 00:29 cmdsubsys.log
tail -n200 /var/log/cron
Jan 14 00:29:01 MYSERVER CROND[13933]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php > /usr/local/nagiosxi/var/perfdataproc.log 2>&1)
Jan 14 00:29:01 MYSERVER CROND[13937]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/cleaner.php > /usr/local/nagiosxi/var/cleaner.log 2>&1)
Jan 14 00:29:01 MYSERVER CROND[13938]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/reportengine.php > /usr/local/nagiosxi/var/reportengine.log 2>&1)
Jan 14 00:29:01 MYSERVER CROND[13939]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/eventman.php > /usr/local/nagiosxi/var/eventman.log 2>&1)
Jan 14 00:29:01 MYSERVER CROND[13940]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/feedproc.php > /usr/local/nagiosxi/var/feedproc.log 2>&1)
Jan 14 00:29:01 MYSERVER CROND[13941]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/cmdsubsys.php > /usr/local/nagiosxi/var/cmdsubsys.log 2>&1)
Jan 14 00:29:01 MYSERVER CROND[13942]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/nom.php > /usr/local/nagiosxi/var/nom.log 2>&1)
Jan 14 00:29:01 MYSERVER CROND[13952]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php > /usr/local/nagiosxi/var/sysstat.log 2>&1)
Jan 14 00:30:01 MYSERVER CROND[31418]: (root) CMD (/usr/lib64/sa/sa1 1 1)
Jan 14 00:30:01 MYSERVER CROND[31422]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/dbmaint.php > /usr/local/nagiosxi/var/dbmaint.log 2>&1)
Jan 14 00:30:01 MYSERVER CROND[31425]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php > /usr/local/nagiosxi/var/perfdataproc.log 2>&1)
Jan 14 00:30:01 MYSERVER CROND[31423]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/eventman.php > /usr/local/nagiosxi/var/eventman.log 2>&1)
Jan 14 00:30:01 MYSERVER CROND[31419]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/reportengine.php > /usr/local/nagiosxi/var/reportengine.log 2>&1)
Jan 14 00:30:01 MYSERVER CROND[31426]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/nom.php > /usr/local/nagiosxi/var/nom.log 2>&1)
Jan 14 00:30:01 MYSERVER CROND[31427]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/feedproc.php > /usr/local/nagiosxi/var/feedproc.log 2>&1)
Jan 14 00:30:01 MYSERVER CROND[31424]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/cmdsubsys.php > /usr/local/nagiosxi/var/cmdsubsys.log 2>&1)
Jan 14 00:30:01 MYSERVER CROND[31428]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php > /usr/local/nagiosxi/var/sysstat.log 2>&1)
Jan 14 00:30:01 MYSERVER CROND[31430]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/cleaner.php > /usr/local/nagiosxi/var/cleaner.log 2>&1)
Jan 14 00:30:01 MYSERVER CROND[31420]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/deadpool.php > /usr/local/nagiosxi/var/deadpool.log 2>&1)
Jan 14 00:30:01 MYSERVER CROND[31421]: (root) CMD (LANG=C LC_ALL=C /usr/bin/mrtg /etc/mrtg/mrtg.cfg --lock-file /var/lock/mrtg/mrtg_l --confcache-file /var/lib/mrtg/mrtg.ok)
Re: Nagiosxi does not show availability report
Posted: Thu Jan 14, 2016 11:29 am
by jolson
"but I am still receiving the message "Availability data is not available when monitoring engine is not running.""
Is this the only error that is left?
Check on a couple of things for us please:
Code: Select all
ps -ef | grep cron
cat /etc/cron.d/nagiosxi
Does cron show more running processes now? Anything notable in the cron log?
Re: Nagiosxi does not show availability report
Posted: Mon Jan 18, 2016 8:56 am
by jonathan.cruz
No errors in the cron log. All jobs are running.
Re: Nagiosxi does not show availability report
Posted: Mon Jan 18, 2016 11:24 am
by jonathan.cruz
But I keep receiving the message:
Availability data is not available when monitoring engine is not running.
Re: Nagiosxi does not show availability report
Posted: Mon Jan 18, 2016 2:31 pm
by rkennedy
Can you post another screenshot of your 'XI System Component Page'?
Also, can you please verify your Nagios configuration using this command, and post the output for us?
Code: Select all
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
Re: Nagiosxi does not show availability report
Posted: Tue Jan 19, 2016 7:52 am
by jonathan.cruz
$ /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
Nagios Core 4.1.1
Copyright (c) 2009-present Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 08-19-2015
License: GPL
Website: https://www.nagios.org
Reading configuration data...
Read main config file okay...
Checking objects...
Checked 5735 services.
Checked 479 hosts.
Checked 19 host groups.
Checked 8 service groups.
Checked 27 contacts.
Checked 5 contact groups.
Checked 139 commands.
Checked 33 time periods.
Checked 0 host escalations.
Checked 0 service escalations.
Checking for circular paths...
Checked 479 hosts
Checked 0 service dependencies
Checked 0 host dependencies
Checked 33 timeperiods
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...
Total Warnings: 0
Total Errors: 0
Things look okay - No serious problems were detected during the pre-flight check
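For repeated checks, the pre-flight result can be reduced to its error count. This is only a sketch: `preflight_errors` is a helper name made up for this example, and it relies on the "Total Errors:" line appearing as it does in the output above.

```shell
# preflight_errors: read `nagios -v` output on stdin, print the "Total Errors" count
preflight_errors() {
    awk -F': *' '/^Total Errors:/ { print $2 }'
}

# Paths as used in this thread; skipped if the binary is not present.
nagios_bin=/usr/local/nagios/bin/nagios
nagios_cfg=/usr/local/nagios/etc/nagios.cfg

if [ -x "$nagios_bin" ]; then
    errors=$("$nagios_bin" -v "$nagios_cfg" | preflight_errors)
    echo "pre-flight errors: ${errors:-unknown}"
fi
```

A clean pre-flight check only says the configuration parses. It does not show whether the monitoring engine process is actually running.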