Hi guys,
We have XI 2014R.7.
Last night and into this morning we had a gap in all our performance data in our PRODUCTION environment.
Nagios.log (indicating issue)
[Tue Dec 8 17:20:18 2015] Warning: fork() in my_system_r() failed for command "/bin/mv /usr/local/nagios/var/service-perfdata /usr/local/nagios/var/spool/xidpe/1449613218.perfdata.service"
NPCD.log (indicates gap in perf data)
[12-08-2015 22:48:25] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//1449632884.perfdata.service'
[12-09-2015 11:08:15] NPCD: ERROR: Executed command exits with return code '1'
There is no system error in /var/log/messages. Our ulimit is below. Any idea why this issue would occur? And why it would suddenly clear up at that time? Thx.
# ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 95125
max locked memory (kbytes, -l) 128
max memory size (kbytes, -m) unlimited
open files (-n) 4096
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 20480
cpu time (seconds, -t) unlimited
max user processes (-u) 4096
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
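For anyone landing here with the same warning: fork() generally fails either because the per-user task limit has been reached (EAGAIN) or because memory is short (ENOMEM). The ulimit output above comes from an interactive shell, so it may not match the limits the running daemon actually has. Below is a rough sketch for comparing a user's task count with the limit. The user name "nagios" and the helper names are assumptions for illustration and are not taken from this thread.

```shell
#!/bin/sh
# Sketch only: the user name "nagios" is an assumption, adjust as needed.

# usage_pct <used> <limit>: integer percentage of the limit in use,
# or "n/a" when the limit is not a positive number (e.g. "unlimited").
usage_pct() {
    if [ "$2" -gt 0 ] 2>/dev/null; then
        echo $(( $1 * 100 / $2 ))
    else
        echo "n/a"
    fi
}

# count_user_tasks <user>: processes plus threads owned by the user.
# On Linux, threads also count against "max user processes".
count_user_tasks() {
    ps -L -u "$1" -o pid= 2>/dev/null | wc -l | tr -d ' '
}

user="nagios"
limit=$(ulimit -u 2>/dev/null) || limit="unknown"
used=$(count_user_tasks "$user")
echo "$user: $used tasks, limit $limit, usage $(usage_pct "$used" "$limit")%"
```

The limits of an already running process can also be read from /proc/<pid>/limits, which avoids guessing what the daemon inherited at start-up.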
Gap in perf data : fork my_system_r failed
Re: Gap in perf data : fork my_system_r failed
How many hosts / services are you currently checking?
Additionally, what is the result of top|head -5?
Former Nagios Employee
Re: Gap in perf data : fork my_system_r failed
948 hosts, 6679 services. XI server + 2 mod_gearman servers in this environment.
top - 12:54:37 up 6 days, 13:52, 2 users, load average: 4.40, 4.11, 3.72
Tasks: 329 total, 2 running, 326 sleeping, 0 stopped, 1 zombie
Cpu(s): 27.5%us, 4.9%sy, 0.0%ni, 67.0%id, 0.3%wa, 0.0%hi, 0.3%si, 0.0%st
Mem: 12197752k total, 8215544k used, 3982208k free, 106536k buffers
Swap: 2064380k total, 45088k used, 2019292k free, 2584568k cached
[root@bed-600-124 var]#
Re: Gap in perf data : fork my_system_r failed
Is there any stacked up perfdata? It seems like your 'mv' command failed. This could be due to an abundance of stacked up performance data.
How many CPUs are in your system? Free memory? Disk space?
ls -l /usr/local/nagios/var/spool/perfdata | wc -l
ls -l /usr/local/nagios/var/spool/xidpe | wc -l
lscpu
free -m
df -h
Re: Gap in perf data : fork my_system_r failed
I may have seen the xidpe directory fill up at the inode level but still indicate plenty of disk space. Could that be an issue?
--
CPU(s): 8
On-line CPU(s) list: 0-7
Thread(s) per core: 1
Core(s) per socket: 4
--
--
free -m
total used free shared buffers cached
Mem: 11911 9200 2711 13 146 2646
-/+ buffers/cache: 6407 5503
Swap: 2015 44 1971
--
--
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
3.9G 2.5G 1.2G 69% /
tmpfs 5.9G 0 5.9G 0% /dev/shm
/dev/sda1 92M 72M 16M 83% /boot
/dev/mapper/VolGroup00-LogVol05
2.9G 616M 2.2G 22% /home
/dev/mapper/VolGroup00-LogVol02
129G 81G 42G 67% /usr
/dev/mapper/VolGroup00-LogVol03
5.8G 4.0G 1.6G 73% /var
--
--
Re: Gap in perf data : fork my_system_r failed
brdr wrote: I may have seen the xidpe directory fill up at the inode level but still indicate plenty of disk space. Could that be an issue?
Yes, running out of inodes is a big problem.
You will need to add more space to that disk and that will increase the amount of available inodes.
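Before resizing anything, it may help to confirm where the inodes are going. A minimal sketch that counts entries under the two spool directories mentioned earlier in this thread and then prints inode usage with `df -i`. The helper name `count_entries` is made up for illustration, and the sketch assumes a `find` that supports `-xdev` and `-mindepth`.

```shell
#!/bin/sh
# count_entries <dir>: number of files and directories below <dir>,
# staying on the same filesystem. Each entry uses one inode.
count_entries() {
    find "$1" -xdev -mindepth 1 2>/dev/null | wc -l | tr -d ' '
}

# Spool paths as quoted in this thread; skipped if they do not exist.
for d in /usr/local/nagios/var/spool/perfdata /usr/local/nagios/var/spool/xidpe; do
    if [ -d "$d" ]; then
        echo "$d: $(count_entries "$d") entries"
    fi
done

# Inode totals for the filesystem holding the Nagios var directory.
df -i /usr/local/nagios/var 2>/dev/null || echo "df -i: path not found"
```

If one spool directory holds a very large number of small files, clearing the backlog frees inodes without touching the partition size.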
Brings back memories of MS-DOS 6 and partitioning the 250MB hard disk into smaller partitions to prevent wasted space
Re: Gap in perf data : fork my_system_r failed
I know, right!
Thanks for the info. Before adding space to the partition I want to see whether maxing out the inodes is the real problem. We had some strange behaviour last week too (Nagios XI hung), so I hope this is related.
I found a good plugin to check inodes on the XI server and added a check for the partition.
Here is the plugin in case anyone is interested:
https://exchange.nagios.org/directory/P ... es/details
Thanks. Please close.
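For readers who cannot reach the linked plugin, here is a rough sketch of what an inode check could look like. It is not the plugin linked above. It follows the usual Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The function names and the 80/90 thresholds are assumptions chosen for illustration.

```shell
#!/bin/sh
# inode_state <used_pct> <warn_pct> <crit_pct>
# Prints a status word and returns the matching plugin exit code.
inode_state() {
    case "$1" in
        ''|*[!0-9]*) echo "UNKNOWN"; return 3 ;;
    esac
    if [ "$1" -ge "$3" ]; then echo "CRITICAL"; return 2; fi
    if [ "$1" -ge "$2" ]; then echo "WARNING"; return 1; fi
    echo "OK"; return 0
}

# inode_used_pct <path>: IUse% of the filesystem holding <path>,
# taken from the fifth column of "df -Pi" with the % sign removed.
inode_used_pct() {
    df -Pi "$1" 2>/dev/null | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

# check_inodes <path> <warn_pct> <crit_pct>: one status line plus exit code.
check_inodes() {
    pct=$(inode_used_pct "$1")
    rc=0
    state=$(inode_state "$pct" "$2" "$3") || rc=$?
    echo "INODES $state - ${pct:-?}% used on $1"
    return $rc
}

# Example run against the root filesystem with hypothetical thresholds.
# A real plugin would end with: check_inodes <path> <warn> <crit>; exit $?
check_inodes / 80 90 || echo "(non-zero plugin status: $?)"
```

Pointing the check at the partition that holds /usr/local/nagios/var would give a warning before the spool directories exhaust the inodes again.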
Re: Gap in perf data : fork my_system_r failed
I will now close this thread out. Feel free to open a new one if you need more assistance in the future.