On RHEL7 check_disk interfering with autofs timeout

r0tty
Posts: 13
Joined: Tue Aug 27, 2013 12:31 pm

Post by r0tty »

I'm seeing a problem with check_disk (from plugin v2.1.1) running on RHEL7 that I didn't see on RHEL6 - hopefully this output illustrates the problem:

#  service autofs stop
Redirecting to /bin/systemctl stop  autofs.service
#  service autofs start
Redirecting to /bin/systemctl start  autofs.service
#  findmnt -Dno TARGET  | wc -l
11
#  findmnt -Dno TARGET
/dev
/dev/shm
/run
/sys/fs/cgroup
/
/home
/tmp
/opt
/boot
/var
/net/nas/brlhome
#  /opt/nagios/libexec/check_disk -w 20% -c 10% -l -X cifs 
DISK CRITICAL - free space: / 19273 MB (94% inode=99%); /dev 1883 MB (100% inode=99%); /dev/shm 1892 MB (100% inode=99%); /run 1884 MB (99% inode=99%); /sys/fs/cgroup 1892 MB (100% inode=99%); /home 1493 MB (97% inode=99%); /tmp 2005 MB (98% inode=99%); /opt 2003 MB (98% inode=99%); /boot 52 MB (26% inode=99%); /var 9683 MB (94% inode=99%); <snip>lots of remote filesystems managed by automount</snip>
#  findmnt -Dno TARGET  | wc -l
59
#  /opt/nagios/libexec/check_disk -w 20% -c 10% -l -X cifs 
DISK OK - free space: / 19273 MB (94% inode=99%); /dev 1883 MB (100% inode=99%); /dev/shm 1892 MB (100% inode=99%); /run 1884 MB (99% inode=99%); /sys/fs/cgroup 1892 MB (100% inode=99%); /home 1493 MB (97% inode=99%); /tmp 2005 MB (98% inode=99%); /opt 2003 MB (98% inode=99%); /boot 52 MB (26% inode=99%); /var 9683 MB (94% inode=99%);| /=1196MB;16376;18423;0;20470 /dev=0MB;1506;1694;0;1883 /dev/shm=0MB;1513;1702;0;1892 /run=8MB;1513;1702;0;1892 /sys/fs/cgroup=0MB;1513;1702;0;1892 /home=32MB;1220;1373;0;1526 /tmp=32MB;1630;1834;0;2038 /opt=34MB;1630;1834;0;2038 /boot=144MB;156;176;0;196 /var=546MB;8184;9207;0;10230
#  findmnt -Dno TARGET  | wc -l
59
So, what happened there is that the first time check_disk was run it refreshed the connections to all 49 exports available from the remote NAS that /net/nas/brlhome resides on. That's a problem because I now have 48 NFS mounts I'm not using from this server. The whole purpose of autofs (as I understand it) is to reduce resource waste on unnecessary connections, but now it is not possible for it to do its job because autofs on this system has a timeout value of 10 minutes and check_disk will refresh each mount every time it runs, which is every minute.
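
To make the timing argument concrete, here is a minimal sketch of the arithmetic. It assumes, as described above, that every check_disk run counts as an access and resets the idle timer on each mount; the function name `can_expire` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: can an automount ever sit idle long enough to expire
# when something touches it every check_interval seconds?
# Assumption (from the post above): each check resets the autofs idle timer.
can_expire() {
    check_interval=$1   # seconds between check_disk runs
    autofs_timeout=$2   # seconds, as set by --timeout in auto.master
    if [ "$check_interval" -gt "$autofs_timeout" ]; then
        echo yes
    else
        echo no
    fi
}

# Values from this system: a check every 60 s against a 600 s timeout.
can_expire 60 600
```

With these values the idle time never gets past 60 seconds, so under that assumption the 600 second timeout is never reached.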

The second time check_disk is run it only reports on the file systems I expect it to, so it is not immediately obvious that the problem is related to check_disk. But even though check_disk is now reporting okay, I have lots of NFS mounts I don't want, which causes other problems.
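
For anyone wanting to see exactly which mounts appear, a rough sketch of a before/after comparison could look like this. The function name `new_mounts` and the /tmp file names are hypothetical; the check_disk invocation is the one from the output above.

```shell
#!/bin/sh
# new_mounts BEFORE AFTER
# Print the mount targets that are listed in AFTER but not in BEFORE.
# Both files hold one mount target per line.
new_mounts() {
    before_sorted=$(mktemp)
    after_sorted=$(mktemp)
    LC_ALL=C sort -u "$1" > "$before_sorted"
    LC_ALL=C sort -u "$2" > "$after_sorted"
    # comm -13 suppresses lines unique to the first file and lines common
    # to both, leaving only the lines unique to the second file.
    LC_ALL=C comm -13 "$before_sorted" "$after_sorted"
    rm -f "$before_sorted" "$after_sorted"
}

# Possible usage on an affected host (file names are placeholders):
#   findmnt -lno TARGET > /tmp/before.txt
#   /opt/nagios/libexec/check_disk -w 20% -c 10% -l -X cifs > /dev/null
#   findmnt -lno TARGET > /tmp/after.txt
#   new_mounts /tmp/before.txt /tmp/after.txt
```

If the list printed after the first check_disk run contains only paths under /net/nas, that would support the idea that the plugin run is what activates them.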

Ideally, after running check_disk the first time, or any subsequent time, the findmnt command should return only the original 11 devices. This used to work fine on RHEL6, and I notice that there is also an interesting difference in the output of the mount command: RHEL6 returns active mounts and RHEL7 returns all possible mounts. There is also this comment within the RHEL7 mount(8) man page:
The listing and help.
The listing mode is maintained for backward compatibility only.

For more robust and definable output use findmnt(8), especially in your scripts.
I think that may be relevant to this problem, but I'm not 100% sure.

Has anyone else faced this problem? Has anyone got a solution? (BTW, reducing the autofs timeout is not a viable solution).

Thanks in advance for any assistance.
jolson
Attack Rabbit
Posts: 2560
Joined: Thu Feb 12, 2015 12:40 pm

Re: On RHEL7 check_disk interfering with autofs timeout

Post by jolson »

[root@NLS ~]# cat /etc/*relea*
CentOS Linux release 7.2.1511 (Core)
Derived from Red Hat Enterprise Linux 7.2 (Source)
[root@NLS ~]# mount | grep 172
172.16.0.x:/mnt/backup on /mnt/backup type nfs (rw,noatime,vers=3,rsize=131072,wsize=131072,namlen=255,acregmin=1800,acregmax=1800,acdirmin=1800,acdirmax=1800,hard,nolock,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=172.16.0.x,mountvers=3,mountport=20048,mountproto=tcp,local_lock=all,addr=172.16.0.x)
[root@NLS ~]# umount /mnt/backup/
[root@NLS ~]# mount | grep 172
[root@NLS ~]#
I don't think that the mount command lists all available mounts as you described. Could you perform a similar procedure on your server? I'm interested in seeing if the mount command lists your autofs mounts even while they're inactive - this could be a bug in the code of check_disk if that is the case.
rkennedy
Posts: 6579
Joined: Mon Oct 05, 2015 11:45 am

Re: On RHEL7 check_disk interfering with autofs timeout

Post by rkennedy »

Can you also post your autofs configuration?
Former Nagios Employee
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: On RHEL7 check_disk interfering with autofs timeout

Post by tmcdonald »

Also, it might be a good idea to post this as an issue on the official GitHub: https://github.com/nagios-plugins/nagios-plugins

The devs will then have it on their radar and be able to work with it more easily. If you don't have an account I can file it for you.
Former Nagios employee
r0tty
Posts: 13
Joined: Tue Aug 27, 2013 12:31 pm

Re: On RHEL7 check_disk interfering with autofs timeout

Post by r0tty »

Hi,

Thanks for the responses everyone.

@jolson:

Agreed that it doesn't list every mount - it lists every mount available from a remote server that a connection has been established with. The example of different behaviour you've given is interesting, but can you confirm that there are other exports available from the 172 address? If there are then I would be forced to conclude that there is some factor in my environment that is activating these mounts, but I'm not aware of what it could be.

Here is another stripped down example of the behaviour:

#  cat /etc/redhat-release 
Red Hat Enterprise Linux Server release 7.1 (Maipo)
#  mount | wc -l
34
#  mount | grep nas | wc -l
0
#  findmnt -no TARGET | grep nas | wc -l
0
#  cd /net/nas/brlhome
#  mount | wc -l
52
#  mount | grep nas | wc -l
18
#  findmnt -no TARGET | grep nas | wc -l
18
Even findmnt returns all potential mounts. On RHEL6, mount only reported active mounts, but that is no longer true on RHEL7.
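
One thing that might narrow this down is counting those 18 entries by filesystem type. This is only a sketch: it assumes the extra entries could be autofs trigger points rather than real NFS mounts, which nobody in this thread has confirmed. The function name `count_fstypes` is made up, and the findmnt call relies on the util-linux `-r` (raw) output format.

```shell
#!/bin/sh
# count_fstypes PATTERN
# Read "TARGET FSTYPE" lines on stdin and print, for each filesystem type,
# how many targets match PATTERN (an awk regular expression).
count_fstypes() {
    awk -v pat="$1" '
        $1 ~ pat { n[$2]++ }
        END      { for (t in n) print t, n[t] }
    ' | sort
}

# Possible usage on an affected host:
#   findmnt -rno TARGET,FSTYPE | count_fstypes nas
```

A result dominated by `autofs` would point at trigger mounts being listed, while a result dominated by `nfs` or `nfs4` would mean the exports really are mounted.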

@rkennedy:

Unfortunately I am not able to share all the configuration I have or to give the output of all commands (hence the recourse to 'wc -l' for comparing output rather than just listing it). Sorry about that. However, for the purpose of these tests and responses in this posting I have reverted to the standard automounter config:

#  cat /etc/auto.master
/misc	/etc/auto.misc
/net /etc/auto.net --timeout=600

#  cat /etc/auto.net
#!/bin/bash

# This file must be executable to work! chmod 755!

# Look at what a host is exporting to determine what we can mount.
# This is very simple, but it appears to work surprisingly well

key="$1"

# add "nosymlink" here if you want to suppress symlinking local filesystems
# add "nonstrict" to make it OK for some filesystems to not mount
opts="-fstype=nfs,hard,intr,nodev,nosuid"

# Showmount comes in a number of names and varieties.  "showmount" is
# typically an older version which accepts the '--no-headers' flag
# but ignores it.  "kshowmount" is the newer version installed with knfsd,
# which both accepts and acts on the '--no-headers' flag.
#SHOWMOUNT="kshowmount --no-headers -e $key"
#SHOWMOUNT="showmount -e $key | tail -n +2"

for P in /bin /sbin /usr/bin /usr/sbin
do
	for M in showmount kshowmount
	do
		if [ -x $P/$M ]
		then
			SMNT=$P/$M
			break
		fi
	done
done

[ -x $SMNT ] || exit 1

# Newer distributions get this right
SHOWMOUNT="$SMNT --no-headers -e $key"

$SHOWMOUNT | LC_ALL=C cut -d' ' -f1 | LC_ALL=C sort -u | \
	awk -v key="$key" -v opts="$opts" -- '
	BEGIN	{ ORS=""; first=1 }
		{ if (first) { print opts; first=0 }; print " \\\n\t" $1, key ":" $1 }
	END	{ if (!first) print "\n"; else exit 1 }
	' | sed 's/#/\\#/g'
@tmcdonald:

I think that sounds like a good idea. I don't have an account so I would appreciate it if you could file it for me, but let's wait until someone reproduces the behaviour as the possibility remains that this is caused by some element in my environment that I am unaware of.

Thanks again,
Rotty
rkennedy
Posts: 6579
Joined: Mon Oct 05, 2015 11:45 am

Re: On RHEL7 check_disk interfering with autofs timeout

Post by rkennedy »

I didn't finish setting up my test environment today for this - I'll do it tomorrow morning and get back to you with what I figure out.
Former Nagios Employee
rkennedy
Posts: 6579
Joined: Mon Oct 05, 2015 11:45 am

Re: On RHEL7 check_disk interfering with autofs timeout

Post by rkennedy »

Well, I was hoping to see differences between CentOS 6 and 7, but I am not seeing much. When using regular mount / umount it doesn't seem to pick up on the inactive mounts. Even check_disk doesn't seem to pick up on them. Here are the results from the two machines:
[root@suse11 etc]# cat /etc/*rel*
CentOS release 6.7 (Final)

[root@suse11 etc]# mount|grep 192.168.
192.168.3.42:/mnt/test on /mnt/nfs/home type nfs (rw,vers=4,addr=192.168.3.42,clientaddr=192.168.4.254)
192.168.3.42:/mnt/test2 on /mnt/nfs/test2 type nfs (rw,vers=4,addr=192.168.3.42,clientaddr=192.168.4.254)
192.168.3.42:/mnt/test3 on /mnt/nfs/test3 type nfs (rw,vers=4,addr=192.168.3.42,clientaddr=192.168.4.254)
192.168.3.42:/mnt/test3 on /mnt/nfs/test4 type nfs (rw,vers=4,addr=192.168.3.42,clientaddr=192.168.4.254)

[root@suse11 etc]# umount /mnt/nfs/home
[root@suse11 etc]# umount /mnt/nfs/test2
[root@suse11 etc]# umount /mnt/nfs/test3
[root@suse11 etc]# umount /mnt/nfs/test4
[root@suse11 etc]# mount|grep 192.168.|wc -l
0

[root@suse11 libexec]# ./check_disk -w 5 -c 5
DISK OK - free space: / 3686 MB (52% inode=76%); /dev/shm 497 MB (100% inode=99%); /boot 380 MB (84% inode=99%);| /=3398MB;7465;7465;0;7470 /dev/shm=0MB;492;492;0;497 /boot=70MB;471;471;0;476
[root@suse11 libexec]# findmnt
TARGET SOURCE FSTYPE OPTIONS
/ /dev/mapper/VolGroup-lv_root ext4 rw,relatime,barri
├─/proc proc proc rw,relatime
│ ├─/proc/bus/usb /proc/bus/usb usbfs rw,relatime
│ └─/proc/sys/fs/binfmt_misc binfmt_m rw,relatime
├─/sys sysfs sysfs rw,relatime
├─/dev devtmpfs devtmpfs rw,relatime,size=
│ ├─/dev/pts devpts devpts rw,relatime,gid=5
│ └─/dev/shm tmpfs tmpfs rw,relatime
├─/boot /dev/sda1 ext4 rw,relatime,barri
├─/var/lib/nfs/rpc_pipefs sunrpc rpc_pipe rw,relatime
└─/mnt/nfs/test2 192.168.3.42:/mnt/test2[/test2] nfs4 rw,relatime,vers=
[root@suse11 libexec]#


------------------------------------------------------------

[root@localhost ~]# cat /etc/*rel*
CentOS Linux release 7.2.1511 (Core)

[root@localhost ~]# mount|grep 192.168.
192.168.3.42:/mnt/test on /mnt/test type nfs4 (rw,relatime,vers=4.0,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=192.168.3.174,local_lock=none,addr=192.168.3.42)
192.168.3.42:/mnt/test2 on /mnt/test2 type nfs4 (rw,relatime,vers=4.0,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=192.168.3.174,local_lock=none,addr=192.168.3.42)
192.168.3.42:/mnt/test3 on /mnt/test3 type nfs4 (rw,relatime,vers=4.0,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=192.168.3.174,local_lock=none,addr=192.168.3.42)
192.168.3.42:/mnt/test3 on /mnt/test4 type nfs4 (rw,relatime,vers=4.0,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=192.168.3.174,local_lock=none,addr=192.168.3.42)


[root@localhost ~]# umount /mnt/test
[root@localhost ~]# umount /mnt/test2
[root@localhost ~]# umount /mnt/test3
[root@localhost ~]# umount /mnt/test4
[root@localhost ~]# mount|grep 192.168.|wc -l
0
Former Nagios Employee
r0tty
Posts: 13
Joined: Tue Aug 27, 2013 12:31 pm

Re: On RHEL7 check_disk interfering with autofs timeout

Post by r0tty »

Hi,

Thanks for that, but I'm not sure if you're using automount or not? Also, if you have no mounts of the remote server I would not expect to see the behaviour - only if you maintain one of the mounts... something like this:

[root@localhost ~]# umount /mnt/test
[root@localhost ~]# umount /mnt/test2
[root@localhost ~]# umount /mnt/test3
[root@localhost ~]# mount|grep 192.168.|wc -l
4

Rotty
rkennedy
Posts: 6579
Joined: Mon Oct 05, 2015 11:45 am

Re: On RHEL7 check_disk interfering with autofs timeout

Post by rkennedy »

I am not using automount, and even with 1 mount left it seems to act fine. Do you see anything that stands out to you while doing this in your environment using the watch mount command?

Here are the results:

[root@suse11 libexec]# cat /etc/*rel*
CentOS release 6.7 (Final)


[root@suse11 libexec]# mount 192.168.3.42:mnt/test /mnt/test
[root@suse11 libexec]# mount 192.168.3.42:mnt/test2 /mnt/test2
[root@suse11 libexec]# mount 192.168.3.42:mnt/test3 /mnt/test3

[root@suse11 libexec]# umount /mnt/test2
[root@suse11 libexec]# umount /mnt/test3

[root@suse11 libexec]# mount|grep 192.168.|wc -l
1
Former Nagios Employee
r0tty
Posts: 13
Joined: Tue Aug 27, 2013 12:31 pm

Re: On RHEL7 check_disk interfering with autofs timeout

Post by r0tty »

Hi,

Autofs is a key component of this. It is the autofs task of scanning a remote server for available exports that loads them into mount. A normal mount does not have to do the equivalent of 'showmount -e'.
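
To illustrate what I mean, here is a sketch that feeds a made-up export list through the same cut/sort/awk/sed steps as the stock /etc/auto.net posted earlier. The wrapper function `hosts_map_entry` and the two export paths are invented for the example; on a real system the input would come from showmount.

```shell
#!/bin/sh
# hosts_map_entry KEY
# Read "showmount --no-headers -e KEY"-style lines on stdin and print the
# map entry that the stock auto.net logic would hand back to autofs.
hosts_map_entry() {
    key=$1
    opts="-fstype=nfs,hard,intr,nodev,nosuid"
    LC_ALL=C cut -d' ' -f1 | LC_ALL=C sort -u |
        awk -v key="$key" -v opts="$opts" '
        BEGIN { ORS = ""; first = 1 }
              { if (first) { print opts; first = 0 }
                print " \\\n\t" $1, key ":" $1 }
        END   { if (!first) print "\n"; else exit 1 }
        ' | sed 's/#/\\#/g'
}

# Invented export list for a host called "nas" (not real data):
printf '%s\n' '/vol/brlhome *' '/vol/projects *' | hosts_map_entry nas
```

The point is that a single lookup of the key "nas" returns one entry naming every export on the server, not just the one that was asked for.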

I haven't tried the watch command, but when I ran these commands before there was only a split second between the 'cd' and the 'mount'.

#  findmnt -no TARGET | grep nas | wc -l
0
#  cd /net/nas/brlhome
#  mount | wc -l
52
Regards,
Rotty