Instead of marking error as critical in red, put it as Ok an
-
fraguillen
- Posts: 47
- Joined: Fri May 18, 2018 11:40 am
Instead of marking error as critical in red, put it as Ok an
Good morning
I have this problem:
I have a script that returns a value reflecting the status of a service: OK, warning, or critical. The problem is that whichever state it returns, the Nagios XI interface always shows the service as OK and in red, while the Status Information column shows it as Error.
This is the script:
STATE_CRITICAL=2
STATE_OK=0
process="$1"
if [ $(hostname) = "sclc-omc01-prod" -o $(hostname) = "sclc-omc02-prod" ] ; then
    CANT=$( ps -fu oracle| grep -v grep | grep -c "$process" 2> /dev/null)
    if [ $CANT -gt 0 ]; then
        echo "OK: Se encuentra activo el proceso: $process"
        Status=$STATE_OK
    else
        echo "ERROR: Se encuentra caido el proceso, favor revisar para levantar: $process";
        Status=$STATE_CRITICAL
    fi
else
    echo "ERROR: No se encuentra correctamente configurada la alarma, solo corre en el servidor sclc-omc01-prod o sclc-omc02-prod";
    Status=$STATE_CRITICAL
fi
echo "=======fin======="
exit $Status
Best regards...
-
scottwilkerson
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
- Contact:
Re: Instead of marking error as critical in red, put it as O
Can you run the script as it would be run from the CLI and then run
Code: Select all
echo $?
Please show the output
-
fraguillen
- Posts: 47
- Joined: Fri May 18, 2018 11:40 am
Re: Instead of marking error as critical in red, put it as O
[root@prod libexec]# ./check_val_process_bash LDM_Worker_check.sh
ERROR! Se encuentra caido el proceso, favor revisar para levantar: LDM_Worker_check.sh
=======fin=======
[root@prod libexec]# echo $?
2
[root@prod libexec]#
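For readers following along: the exit code of 2 printed above is the value Nagios plugins use, by convention, to mean CRITICAL (0 is OK, 1 is WARNING, 3 is UNKNOWN), and the first line of output is what appears as the status text. A minimal sketch of a helper that runs a check and reports both is below. The names `state_name` and `run_check` are hypothetical and not part of Nagios or of this thread.

```shell
# state_name CODE: translate a plugin exit code into its conventional state name.
state_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        3) echo "UNKNOWN" ;;
        *) echo "OUT-OF-RANGE" ;;   # label chosen for this sketch only
    esac
}

# run_check COMMAND [ARGS...]: run a plugin, then print its exit code,
# the matching state name, and the first line of its output.
run_check() {
    output=$("$@" 2>&1)
    code=$?
    first_line=$(printf '%s\n' "$output" | head -n 1)
    echo "exit=$code state=$(state_name "$code") first_line=$first_line"
}
```

With these defined, something like `run_check ./check_val_process_bash LDM_Worker_check.sh` would summarize the same information as the manual `echo $?` session.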
-
scottwilkerson
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
- Contact:
Re: Instead of marking error as critical in red, put it as O
Is this consistent with what it is displaying in the UI?
Also, do you get the same results if you run this as the nagios user?
Code: Select all
su nagios
./check_val_process_bash LDM_Worker_check.sh
echo $?
-
fraguillen
- Posts: 47
- Joined: Fri May 18, 2018 11:40 am
Re: Instead of marking error as critical in red, put it as O
In the UI, the Status column shows OK in green, but the Status Information column shows:
ERROR! Se encuentra caido el proceso, favor revisar para levantar: LDM_Worker_check.sh
[nagios@prod libexec]$ whoami
nagios
[nagios@prod libexec]$ ./check_val_process_bash LDM_Worker_check.sh
ERROR! Se encuentra caido el proceso, favor revisar para levantar: LDM_Worker_check.sh
=======fin=======
[nagios@prod libexec]$ echo $?
2
[nagios@prod libexec]$
-
scottwilkerson
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
- Contact:
Re: Instead of marking error as critical in red, put it as O
fraguillen wrote:In the UI, the Status column shows OK in green, but the Status Information column shows:
ERROR! Se encuentra caido el proceso, favor revisar para levantar: LDM_Worker_check.sh
Are you sure you are looking at the service and not the OK for the host?
If you drill into the service details (click service name) does it show CRITICAL?
-
fraguillen
- Posts: 47
- Joined: Fri May 18, 2018 11:40 am
Re: Instead of marking error as critical in red, put it as O
How can I upload an image?
-
scottwilkerson
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
- Contact:
Re: Instead of marking error as critical in red, put it as O
Underneath the submit button on the forum is an "Upload attachment" tab. Click that,
select the file, and
click the "Add the file" button.
-
fraguillen
- Posts: 47
- Joined: Fri May 18, 2018 11:40 am
Re: Instead of marking error as critical in red, put it as O
Attached image of Nagios XI
-
scottwilkerson
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
- Contact:
Re: Instead of marking error as critical in red, put it as O
I just looked at your script again, and it is possible you are missing a newline at the end of the script.
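One way to test the missing-newline guess is sketched below. It relies on the fact that command substitution strips trailing newlines, so the last byte of a file that ends with a newline comes back as an empty string. The function name `ends_with_newline` is hypothetical, and whether a missing newline is really the cause here is not confirmed in this thread.

```shell
# ends_with_newline FILE: succeed if FILE is non-empty and its last byte is a newline.
ends_with_newline() {
    # tail -c 1 prints the final byte; $(...) drops it if it is a newline.
    [ -s "$1" ] && [ -z "$(tail -c 1 "$1")" ]
}
```

For example, `ends_with_newline ./check_val_process_bash || echo "no trailing newline"` would flag the plugin file if its last line is unterminated.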
May I suggest the following modifications
Code: Select all
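# Nagios plugin exit codes used by this check
STATE_CRITICAL=2
STATE_OK=0
process="$1"
# Only run on the two servers this check is meant for
if [ $(hostname) = "sclc-omc01-prod" -o $(hostname) = "sclc-omc02-prod" ] ; then
    # Count processes owned by oracle whose command line matches the given name
    CANT=$( ps -fu oracle| grep -v grep | grep -c "$process" 2> /dev/null)
    if [ $CANT -gt 0 ]; then
        # Process is running: report OK and exit immediately
        echo "OK: Se encuentra activo el proceso: $process"
        exit $STATE_OK
    else
        # Process is down: report CRITICAL and exit immediately
        echo "ERROR: Se encuentra caido el proceso, favor revisar para levantar: $process";
        exit $STATE_CRITICAL
    fi
else
    # Wrong host: the check is only configured for the two servers above
    echo "ERROR: No se encuentra correctamente configurada la alarma, solo corre en el servidor sclc-omc01-prod o sclc-omc02-prod";
    exit $STATE_CRITICAL
fi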
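As a further illustration, the same logic can be split into small functions with every expansion quoted, which makes each piece testable without an oracle user or the production hostnames. This is only a sketch under assumptions: the function names `is_allowed_host`, `count_processes`, and `count_to_state` are hypothetical, and the quoting and the `case` statement are stylistic choices that were not suggested in this thread.

```shell
STATE_OK=0
STATE_CRITICAL=2

# is_allowed_host NAME: succeed only for the two servers named in the thread.
is_allowed_host() {
    case "$1" in
        sclc-omc01-prod|sclc-omc02-prod) return 0 ;;
        *) return 1 ;;
    esac
}

# count_processes USER PATTERN: count USER's processes whose ps line contains PATTERN.
count_processes() {
    ps -fu "$1" 2>/dev/null | grep -v grep | grep -c -- "$2"
}

# count_to_state COUNT PROCESS: print one status line and return the plugin exit code.
count_to_state() {
    count="$1"
    process="$2"
    if [ "$count" -gt 0 ]; then
        echo "OK: Se encuentra activo el proceso: $process"
        return "$STATE_OK"
    fi
    echo "ERROR: Se encuentra caido el proceso, favor revisar para levantar: $process"
    return "$STATE_CRITICAL"
}
```

A plugin built on these could end with `count_to_state "$(count_processes oracle "$1")" "$1"; exit $?` after checking `is_allowed_host "$(hostname)"`, so that exactly one status line is printed before the exit code is returned.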