
Notifications in UI show "OK: Failed Uploads are at (Linux)"

Posted: Mon Jun 03, 2024 8:45 am
by fsbeaunix
We have a few Nagios notifications that each run a query against the DB and are set to go off if the result is above a certain number. They worked as expected for a long time, but recently they started showing:
OK: Failed Uploads are at (Linux)
OK: Funding Uploads (FU) are at PATH
OK: Overall Uploads are at (Linux)
OK: Post Closing Multi Doc Uploads (PC) are at permissions

When run from the command line, the check returns results. When run through the Nagios UI, it returns the output above (also shown in the attached screenshot).

Command line output, run as myself (kkuzmano), as nagios, and as root:
[kkuzmano@nagiosh01 libexec]$ ./check_oracle_health_failed_uploads "Failed Uploads"
OK: Failed Uploads are at 0

[kkuzmano@nagiosh01 libexec]$ /usr/local/nagios/libexec/check_oracle_health_failed_uploads "Failed Uploads"
OK: Failed Uploads are at 0

[nagios@nagiosh01 libexec]$ ./check_oracle_health_failed_uploads "Failed Uploads"
OK: Failed Uploads are at 0

[nagios@nagiosh01 libexec]$ /usr/local/nagios/libexec/check_oracle_health_failed_uploads "Failed Uploads"
OK: Failed Uploads are at 0

[root@nagiosh01 libexec]# ./check_oracle_health_failed_uploads "Failed Uploads"
OK: Failed Uploads are at 0

[root@nagiosh01 libexec]# /usr/local/nagios/libexec/check_oracle_health_failed_uploads "Failed Uploads"
OK: Failed Uploads are at 0

Permissions are:
-rwxrwxrwx 1 apache nagios 552 Oct 20 2021 check_oracle_health_failed_uploads

The files have not been touched since Oct 2021. We have other scripts that follow the same concept: connect to the DB, run a query against a table, and notify us if a queue is above a certain number. I have attached the profile and a screenshot of the script. I have checked the permissions, and they match the other scripts that are working. The script appears to be half working: it produces some output, but not the result it should.

I've added a screen shot of:
The script
What the UI looks like with the returned results
Profile

Re: Notifications in UI show "OK: Failed Uploads are at (Linux)"

Posted: Mon Jun 03, 2024 5:02 pm
by lgute
Hi @fsbeaunix, thanks for reaching out.

We could use a bit more information to try and help you out. What version of XI are you running, and on what distro/version?

Re: Notifications in UI show "OK: Failed Uploads are at (Linux)"

Posted: Tue Jun 04, 2024 10:07 am
by fsbeaunix
Hello,

[kkuzmano@nagiosh01 var]$ cat /usr/local/nagiosxi/var/xiversion|grep full
full=5.11.2

I would also like to add that the script shown in the screenshot calls another script.

The one I've been troubleshooting is Failed upload status:
Config name in UI: Failed upload status
Check command: check_oracle_health_failed_uploads
Command view: /usr/local/nagios/libexec/check_oracle_health_failed_uploads $ARG1$

The script check_oracle_health_failed_uploads calls another script:
#!/bin/bash

SQL="select count(*) from ISCMS.LIVE_UPLOADS_TB where PROCESS_STATUS = 'FAIL'"
DB_NAME=$1
export UPLOAD=`/usr/local/nagios/libexec/check_oracle_health --connect FRS --username Removed --password Removed -mode sql --name="$SQL" --warning 45 --critical 50 | awk '{ print $11 }'`
#echo $UPLOAD
#echo $DB_NAME

The UPLOAD variable is set from the output of the other script:
export UPLOAD=`/usr/local/nagios/libexec/check_oracle_health
When I look at the permissions, check_oracle_health is owned by apache:nagios.

In the UI, where it should give the number, it gives "permissions", "(Linux)", or "PATH" instead.
Expected result: OK: Failed Uploads are at 15!
What it gives: OK: Failed Uploads are at (Linux)

Re: Notifications in UI show "OK: Failed Uploads are at (Linux)"

Posted: Tue Jun 04, 2024 11:28 am
by swolf
Hi @fsbeaunix,

It looks to me like you're retrieving UPLOAD by grabbing a specific column of the check_oracle_health plugin output. I would recommend capturing the full output of the plugin and printing it to find any differences, e.g.:

Code: Select all

# Build the command as a bash array so "$SQL" stays a single argument.
COMMAND=(/usr/local/nagios/libexec/check_oracle_health --connect FRS --username Removed --password Removed -mode sql --name="$SQL" --warning 45 --critical 50)
echo "COMMAND is: ${COMMAND[*]}"
# Capture stderr as well, so any error text shows up in the output.
COMMAND_OUTPUT=$("${COMMAND[@]}" 2>&1)
echo "COMMAND_OUTPUT is: $COMMAND_OUTPUT"
UPLOAD=$(echo "$COMMAND_OUTPUT" | awk '{ print $11 }')
echo "UPLOAD is: $UPLOAD"

My guess is that either you're running a different command in the terminal without being aware of it (i.e. possibly ARG1 isn't being evaluated properly), or you somehow get an unstable output from the command and will have to parse it differently.
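
If the output does turn out to vary, one way to parse it differently is to check that the extracted field is really a number before printing it. The following is only a sketch under assumptions: `extract_count` and `report` are hypothetical helpers (not part of check_oracle_health), field 11 is taken from the original script, and the UNKNOWN message wording is made up.

```shell
# extract_count and report are hypothetical helpers, not part of check_oracle_health.
extract_count() {
    # Field 11 of the first line only (the original awk prints it for every line).
    value=$(printf '%s\n' "$1" | awk 'NR == 1 { print $11 }')
    case "$value" in
        ''|*[!0-9]*) return 1 ;;   # empty or contains a non-digit: reject
    esac
    printf '%s\n' "$value"
}

report() {
    # $1 = label (DB_NAME in the original script), $2 = raw plugin output
    if count=$(extract_count "$2"); then
        echo "OK: $1 are at $count"
        return 0
    fi
    # Exit code 3 is the Nagios UNKNOWN state; include the raw text for debugging.
    echo "UNKNOWN: could not parse a count for $1 from: $2"
    return 3
}
```

In the wrapper this could be called as `report "$DB_NAME" "$COMMAND_OUTPUT"; exit $?`, so a word such as "(Linux)" in field 11 would surface as UNKNOWN together with the full plugin output, rather than as an OK result.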

I'd also recommend running the command in the CCM's edit service page as well, to see if the command line is different from what you're typing in any way.
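
A check that behaves differently in a terminal and in the UI is sometimes down to environment variables that exist in a login shell but not where the check actually runs. As a rough way to test that idea from the terminal, the command can be run with an almost empty environment. This is a sketch under assumptions: `run_clean` is a hypothetical helper, and the minimal `PATH` is a guess, not the environment Nagios XI really uses.

```shell
# run_clean is a hypothetical helper: run a command with a near-empty environment.
# The PATH value is an assumption, not what Nagios XI actually sets.
run_clean() {
    env -i PATH=/usr/bin:/bin "$@"
}
```

For example, `run_clean /usr/local/nagios/libexec/check_oracle_health_failed_uploads "Failed Uploads"` could be run as the nagios user. If the output then changes from a number to a word, comparing the output of `env` in the login shell against the stripped environment may show which variable the plugin depends on.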

-Sebastian