Nagios Support Forum

Posted: **Thu Jul 08, 2021 1:52 am**

Hi.

I have the following situation.
We have a Nagios XI server installed on VMware (version 5.8.2).

The NCPA agent (version 2.2.1) is installed on an AIX server (7200-05-02-2114), and the readings for the disk 'OraData' have been correct for half a year.
After extending the disk from 15 TB to 16 TB (adding 1 TB), the readings are incorrect.

NCPA Output:
CRITICAL: Used disk space was -531.50 % (Used: -770.44 GiB, Free: 915.40 GiB, Total: 144.97 GiB)

Is there some disk size limitation, or is this something else?

We did not change the command, it is still the same:
check_xi_ncpa!-t '$USER10$' -P 5693 -M 'disk/logical/|OraData' -w '98' -c '99'

Thanks in advance for any information.

Best regards, Aljaž

Posted: **Thu Jul 08, 2021 11:32 am**

Hello Aljaž,

Thanks for reaching out about the disk size issue.

It sounds like the disk partition was resized but the filesystem on it was not grown to match. Please check the resize2fs man page for details. Here is an article that we also provide.

Thanks,
Perry

Posted: **Fri Jul 09, 2021 12:35 am**

Hi, Perry.

It seems there has been a misunderstanding.

With Nagios XI we are monitoring an AIX server (version 7200-05-02-2114).
After extending the filesystem on this AIX server from 15 TB to 16 TB, we are getting wrong filesystem usage for /OraData.
For all other filesystems, disk usage shows correct values.

Here is the output on AIX server:

[email protected] /home/sa.vk# df -g
Filesystem                            GB blocks     Free  %Used    Iused  %Iused  Mounted on
/dev/hd4                                   0,88     0,38    57%    18194     17%  /
/dev/hd2                                   6,12     1,09    83%    77560     22%  /usr
/dev/hd9var                                2,50     0,91    64%    15507      7%  /var
/dev/hd3                                   4,00     3,90     3%     2563      1%  /tmp
/dev/hd1                                   2,12     1,20    44%     2024      1%  /home
/proc                                         -        -      -        -       -  /proc
/dev/hd10opt                               3,12     1,75    45%    24108      6%  /opt
/dev/livedump                              0,25     0,25     1%        4      1%  /var/adm/ras/livedump
/dev/nmonlv                                5,00     2,31    54%      800      1%  /nmon
/dev/orahomelv                            75,00    36,29    52%   286527      4%  /OraBase1
/dev/oradatalv                         16529,00   905,62    95%      654      1%  /OraData
/dev/oraarchloglv                        765,00   590,33    23%      731      1%  /OraArchlog
/dev/oradiaglv                            50,00    49,59     1%     3578      1%  /OraDiag
/dev/oraflashlv                          150,00    92,52    39%      322      1%  /OraFlash
/dev/hd11admin                             0,12     0,12     1%       11      1%  /admin
/dev/RedoLog1lv                           25,00     4,87    81%       14      1%  /RedoLog1
/dev/RedoLog2lv                           25,00     4,87    81%       14      1%  /RedoLog2
jarovit:/home/razvoj01/prmzav/data       217,31    29,97    87%   271840      4%  /home/razvoj01/prmzav/data
jarovit:/home/razvoj01/ppm0/data         217,31    29,97    87%   271840      4%  /home/razvoj01/ppm0/data
jarovit:/home/razvoj01/sifranti/data     217,31    29,97    87%   271840      4%  /home/razvoj01/sifranti/data
jarovit:/home/razvoj01/skode/data        217,31    29,97    87%   271840      4%  /home/razvoj01/skode/data
jarovit:/home/razvoj01/docarch/data      217,31    29,97    87%   271840      4%  /home/razvoj01/docarch/data
[email protected] /home/sa.vk#

Best regards, Aljaž

Posted: **Fri Jul 09, 2021 12:43 pm**

Hello Aljaž,

Thanks for following up with the details. We see that your df line '/dev/oradatalv 16529,00 905,62 95% 654 1% /OraData' shows the correct total size.

We want to get some further data points from this mount point to determine what is going on.

Please run the following and provide the results:

Code:

du -h -d 0 /OraData -c | grep -Ei 'total'

Verbose output on the ncpa command:

Code:

/usr/local/nagios/libexec/check_ncpa.py -H [yourhostip_or_name] -t '[your_ncpa_token]' -P 5693 -M 'disk/logical/|OraData' -w '98' -c '99' -v

Thanks,
Perry

Posted: **Fri Jul 09, 2021 12:53 pm**

What type of filesystem is it? (JFS/JFS2/ext4/etc)

Did you restart the ncpa_listener service after the resize to see if that resolves it?

NCPA uses the psutil Python library to get this information. While researching this I saw some references to 16 TB limits on PPC for memory/JFS/JFS2, but I'm not sure how that translates into what psutil reads from the backend, or whether it's related at all.
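
One way to test the 16 TB idea against the numbers already posted in this thread: if the total were stored as a 32-bit count of 4 KiB blocks, it would wrap at 2^32 * 4096 bytes = 16 TiB. The sketch below is only an arithmetic check of that hypothesis. The 4 KiB block size, the 32-bit counter, and the "used = total - free" rule are assumptions, not confirmed behaviour of NCPA, psutil, or AIX, and the function names are made up for illustration.

```python
GIB = 2 ** 30
BLOCK_SIZE = 4096        # assumption: 4 KiB filesystem blocks
WRAP_BLOCKS = 2 ** 32    # assumption: block count truncated to 32 bits


def wrapped_total_gib(true_total_gib, block_size=BLOCK_SIZE):
    """Total size in GiB after the block count wraps at 32 bits."""
    blocks = int(true_total_gib * GIB) // block_size
    return (blocks % WRAP_BLOCKS) * block_size / GIB


def usage_from_total_and_free(total_gib, free_gib):
    """Derive used space and used percent the naive way (total minus free)."""
    used_gib = total_gib - free_gib
    return used_gib, 100.0 * used_gib / total_gib


# df -g reports 16529 GB for /OraData; a 15 TB filesystem stays below the wrap.
print(wrapped_total_gib(16529.0))      # close to the 144.97 GiB NCPA reported
print(wrapped_total_gib(15 * 1024.0))  # unchanged, no wrap below 16 TiB

# Feeding NCPA's own total and free values back in gives a negative result.
print(usage_from_total_and_free(144.97, 915.40))
```

Under these assumptions the wrapped total comes out at about 145 GiB, and total minus free is about -770 GiB, which is in line with the NCPA output in the first post. That would also fit the readings having been correct while the filesystem was 15 TB.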

Posted: **Thu Jul 29, 2021 1:53 am**

Perry, hi.

Sorry for the late reply, but I have been absent.

Here are the outputs you asked for:

dev-srv-devana@root /# /opt/freeware/bin/du -h -d 0 /OraData -c | grep -Ei 'total'
16T total

And the second one for this command:
/usr/local/nagios/libexec/check_ncpa.py -H [yourhostip_or_name] -t '[your_ncpa_token]' -P 5693 -M 'disk/logical/|OraData' -w '98' -c '99' -v

File returned contained:
{
"returncode": 2,
"stdout": "CRITICAL: Used disk space was -448.30 % (Used: -649.84 GiB, Free: 794.80 GiB, Total: 144.97 GiB) | 'used'=-649.84GiB;142;144; 'free'=794.80GiB;142;144; 'total'=144.97GiB;142;144;"
}
CRITICAL: Used disk space was -448.30 % (Used: -649.84 GiB, Free: 794.80 GiB, Total: 144.97 GiB) | 'used'=-649.84GiB;142;144; 'free'=794.80GiB;142;144; 'total'=144.97GiB;142;144;

BTW: filesystem is JFS2

Best regards, Aljaž

Posted: **Thu Jul 29, 2021 6:59 pm**

Please create a bug report for this here with your AIX system info/oslevel/etc so that the developers can investigate the issue:

https://github.com/NagiosEnterprises/ncpa/issues

You may need to use a plugin with NCPA as a workaround until they release a fix. It has to be related to psutil, because that's where the data is taken from.

See here:

https://exchange.nagios.org/directory/P ... ms/details

And here:

https://support.nagios.com/kb/article/n ... a-722.html
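
As a rough illustration of the plugin workaround, the core of a custom check could take the /OraData row from `df -g` (which shows the correct total earlier in this thread) and turn it into a Nagios-style status. This is a minimal sketch, not one of the linked plugins: the function names are hypothetical, the column layout and comma decimals are taken from the `df -g` output posted above, and the default thresholds mirror the `-w '98' -c '99'` values from the original command.

```python
def parse_df_g_line(line):
    """Parse one data row of `df -g` output in the layout shown in this thread.

    Columns: Filesystem, GB blocks, Free, %Used, Iused, %Iused, Mounted on.
    Decimal commas (e.g. "16529,00") are converted to decimal points.
    """
    fields = line.split()
    total_gb = float(fields[1].replace(",", "."))
    free_gb = float(fields[2].replace(",", "."))
    mount = fields[6]
    return mount, total_gb, free_gb


def check_disk(line, warn=98.0, crit=99.0):
    """Return (exit_code, message) using Nagios conventions: 0 OK, 1 WARNING, 2 CRITICAL."""
    mount, total_gb, free_gb = parse_df_g_line(line)
    used_gb = total_gb - free_gb
    used_pct = 100.0 * used_gb / total_gb
    if used_pct >= crit:
        code, label = 2, "CRITICAL"
    elif used_pct >= warn:
        code, label = 1, "WARNING"
    else:
        code, label = 0, "OK"
    message = (
        f"{label}: {mount} used disk space was {used_pct:.2f} % "
        f"(Used: {used_gb:.2f} GB, Free: {free_gb:.2f} GB, Total: {total_gb:.2f} GB)"
    )
    return code, message


# Row copied from the df -g output posted earlier in the thread.
row = "/dev/oradatalv 16529,00 905,62 95% 654 1% /OraData"
print(check_disk(row))
```

A real plugin would still need to run `df` itself, print the message, and exit with the returned code. Those parts are left out here because they depend on how the plugin is deployed on the AIX host.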

Posted: **Wed Aug 04, 2021 4:56 am**

Bug report created, thank you.

BR, Aljaž

Posted: **Wed Aug 04, 2021 4:47 pm**

Thank you, the developers will see it. They may ask for additional information, but that looks good.

Posted: **Mon Aug 16, 2021 1:15 am**

Hi, do I have to do anything on GitHub (add a developer or something)? There is still no answer.

Thanks, Aljaž

Wrong disk readings after resizing disk