CRITICAL - popen timeout received, but no child process

bosecorp
Posts: 929
Joined: Thu Jun 26, 2014 1:00 pm

Re: CRITICAL - popen timeout received, but no child process

Post by bosecorp »

Done.

Still seeing the same thing. It's getting better, but it's still happening.
jdalrymple
Skynet Drone
Posts: 2620
Joined: Wed Feb 11, 2015 1:56 pm

Re: CRITICAL - popen timeout received, but no child process

Post by jdalrymple »

[jdalrymple@localhost ~]$ for file in {1..300000}; do touch $file; done
[jdalrymple@localhost ~]$ ls -l | wc -l
300001
[jdalrymple@localhost ~]$ find ./ -type f -exec rm {} \;
[jdalrymple@localhost ~]$ ls -l | wc -l
1
It will take a while, and it may beat up on your CPU, so you may want to do it off-hours. It should work fine though.
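
For a directory holding this many files, a somewhat lighter variant is to let find remove the files itself instead of spawning one rm process per file. This is only a minimal sketch, assuming GNU find (for -delete); the scratch directory from mktemp is a hypothetical stand-in for the real directory, so check the path before pointing it anywhere else:

```shell
#!/bin/sh
# Demo in a scratch directory (hypothetical stand-in for the real directory).
dir=$(mktemp -d)

# Create a batch of empty files to clean up.
i=1
while [ "$i" -le 1000 ]; do
    : > "$dir/$i"
    i=$((i + 1))
done
before=$(find "$dir" -type f | wc -l)

# -delete unlinks each match inside find, avoiding one rm process per file.
# nice lowers the CPU priority so other work on the box is less affected.
nice -n 19 find "$dir" -type f -delete

after=$(find "$dir" -type f | wc -l)
echo "before=$before after=$after"
rmdir "$dir"
```

Counting with find piped to wc -l, rather than ls -l, also avoids the extra "total" line that makes an empty directory report 1.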
bosecorp
Posts: 929
Joined: Thu Jun 26, 2014 1:00 pm

Re: CRITICAL - popen timeout received, but no child process

Post by bosecorp »

Done.

It got better, but it's still happening.
jdalrymple
Skynet Drone
Posts: 2620
Joined: Wed Feb 11, 2015 1:56 pm

Re: CRITICAL - popen timeout received, but no child process

Post by jdalrymple »

Please post your uptime and df output:

[jdalrymple@localhost ~]$ uptime
 15:51:57 up 6 days,  7:03,  1 user,  load average: 0.00, 0.00, 0.02
[jdalrymple@localhost ~]$ df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/centos-lv_root
                       18G  3.1G   14G  19% /
tmpfs                 491M     0  491M   0% /dev/shm
/dev/sda1             477M   28M  425M   6% /boot
bosecorp
Posts: 929
Joined: Thu Jun 26, 2014 1:00 pm

Re: CRITICAL - popen timeout received, but no child process

Post by bosecorp »

root@nagmonus1:(03-31 17:09): /root
# uptime
 17:09:09 up 4 days, 22:25,  3 users,  load average: 3.45, 3.76, 3.88
root@nagmonus1:(03-31 17:09): /root
# df -h
Filesystem                       Size  Used Avail Use% Mounted on
/dev/mapper/rootvg-lvroot        2.0G  820M  1.1G  43% /
tmpfs                            9.9G     0  9.9G   0% /dev/shm
/dev/sda1                        243M   49M  181M  22% /boot
/dev/mapper/rootvg-lvopt         2.0G   92M  1.8G   5% /opt
/dev/mapper/rootvg-lvtmp         6.9G   75M  6.5G   2% /tmp
/dev/mapper/rootvg-lvusers       4.0G  137M  3.7G   4% /users
/dev/mapper/rootvg-lvusr         7.9G  4.1G  3.4G  55% /usr
/dev/mapper/rootvg-lvvar          15G  5.9G  8.2G  42% /var
/dev/mapper/vgapp-lvapp           49G  4.1G   42G   9% /app
/dev/mapper/vgapp-lvstore         69G  6.6G   59G  11% /store
/dev/mapper/vgapp-lvlocalnagios  128G   28G   95G  23% /usr/local/nagios
/dev/mapper/vgapp-lvmysql         69G  2.6G   63G   4% /var/lib/mysql
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Re: CRITICAL - popen timeout received, but no child process

Post by ssax »

For the hosts that are experiencing the problems, are you using check_icmp, check_ping, or something else?
bosecorp
Posts: 929
Joined: Thu Jun 26, 2014 1:00 pm

Re: CRITICAL - popen timeout received, but no child process

Post by bosecorp »

I think check_icmp.

Remember that I am using NRDS or NRDP, or whatever it's called, so the actual checks are done by the host.

But correct me if I am wrong: the actual host check is done by Nagios, right?

This host is a member of the template xiwizard_passive_host, and when I check that host template, it's associated with another host template called xiwizard_generic_host, which has the following command configured: "$USER1$/check_icmp -H $HOSTADDRESS$ -w $ARG1$,$ARG2$ -c $ARG3$,$ARG4$ -p 5 -t 30"
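
The command definition above can be tried by hand by substituting the macros with concrete values. A rough sketch only: the plugin directory, host address, and threshold values below are all placeholders I am assuming for illustration, not values taken from this setup, so swap in whatever the host definition actually passes:

```shell
#!/bin/sh
# All values below are placeholders, not taken from the thread.
USER1=/usr/local/nagios/libexec   # assumed plugin directory ($USER1$)
HOSTADDRESS=192.0.2.10            # example address ($HOSTADDRESS$)
ARG1=3000.0                       # warning round-trip time, ms
ARG2=80%                          # warning packet loss
ARG3=5000.0                       # critical round-trip time, ms
ARG4=100%                         # critical packet loss

# Same shape as the configured command, with the macros expanded.
cmd="$USER1/check_icmp -H $HOSTADDRESS -w $ARG1,$ARG2 -c $ARG3,$ARG4 -p 5 -t 30"
echo "$cmd"

# Run it only if the plugin is actually installed at that path.
if [ -x "$USER1/check_icmp" ]; then
    $cmd
    echo "exit status: $?"   # plugin convention: 0=OK, 1=WARNING, 2=CRITICAL, 3=UNKNOWN
fi
```

Running it as the same user Nagios runs as is usually the closer comparison, since check_icmp needs elevated privileges to send ICMP.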
jdalrymple
Skynet Drone
Posts: 2620
Joined: Wed Feb 11, 2015 1:56 pm

Re: CRITICAL - popen timeout received, but no child process

Post by jdalrymple »

bosecorp,

Sorry, I missed you mentioning that these are passive checks.

NRDS is the tool you're using?

Can you run the check_icmp from the host command line and verify the output?

There is no chance that we now have weird situations where gearman checks are getting mixed in on hosts with NRDS checks, is there? Your environment must be very complicated. Do you guys have any sketches of how it's laid out? How do you determine what is monitored by gearman and what is sending NRDS checks back in?
bosecorp
Posts: 929
Joined: Thu Jun 26, 2014 1:00 pm

Re: CRITICAL - popen timeout received, but no child process

Post by bosecorp »

These checks are being done by the job server.

Yes, I am using NRDS.

This is happening with several AIX-Linux clients.
jdalrymple
Skynet Drone
Posts: 2620
Joined: Wed Feb 11, 2015 1:56 pm

Re: CRITICAL - popen timeout received, but no child process

Post by jdalrymple »

I'm certain we need a better description of your environment. Do you have a Visio diagram of your monitoring infrastructure so we can better understand how it works?

I'm having a hard time wrapping my mind around what you just said. The "Job server" is by definition the Nagios server, or the server with the NEB module installed and running. Passive checks on remote hosts cannot be submitted by the Nagios server.

If your environment is configured and running just as described above then I can understand why there would be issues :)