Re: APC UPS MONITORIN
Posted: Mon Jun 13, 2016 7:27 pm
by orani
Misunderstanding... the problem with the Perl library is solved. When I execute the check script from the command line it gives the right result, but in the Nagios web interface I am getting the error "(no output returned from plugin)".
Re: APC UPS MONITORIN
Posted: Mon Jun 13, 2016 9:01 pm
by Box293
orani wrote: Will it work? As I read it, this tutorial is about Nagios XI, and I have Nagios Core.
Do I have to replace important_application with perl?
I don't have Oracle installed. What about the second line I should add?
The article, while targeted at XI, applies identically to Nagios Core, because it only touches Nagios Core components.
The article is an "example article": it shows you how to make environment variables available to Core when it runs. The values in the article are examples; they are not specific to your problem. You need to work out which variables need updating (for example, the PATH, so that it includes the library for the Perl module).
orani wrote: Misunderstanding... the problem with the Perl library is solved. When I execute the check script from the command line it gives the right result, but in the Nagios web interface I am getting the error "(no output returned from plugin)".
What is the exact command you are executing at the command line along with the output?
Show us your command and service definition you created.
Re: APC UPS MONITORIN
Posted: Tue Jun 14, 2016 12:59 pm
by orani
These are my definitions:
Code: Select all
define command{
command_name check_ups_powerware
command_line $USER1$/check_ups_powerware.pl -H $HOSTADDRESS$ -C $ARG$
}
Code: Select all
define service {
host_name 10.0.5.112
service_description Battery Remaining
check_command check_ups_powerware!public!battery_remaining!-U h!$
max_check_attempts 1000
check_interval 1
retry_interval 1
check_period 24x7
notification_interval 60
notification_period 24x7
contacts nagiosadmin
}
When I run the command from the command line:
Code: Select all
./check_ups_powerware.pl -H 10.0.5.112 -C public -T battery_remaining -U h
I get this result:
Code: Select all
# OK: Battery Remaining: 5.68 Hours|BatteryRemaining=5.68Hours;;;;
But at the same time the Nagios web interface gives me "no output returned from plugin".
Re: APC UPS MONITORIN
Posted: Tue Jun 14, 2016 5:05 pm
by Box293
We may need to turn on debugging in Nagios, make the problem occur and then review the debug logs.
Try setting the debug level on and then restart Nagios.
Code: Select all
sed -i 's/.*debug_level=.*/debug_level=-1/g' /usr/local/nagios/etc/nagios.cfg
service nagios restart
Make the problem occur (schedule an immediate check).
Look at the file /usr/local/nagios/var/nagios.debug to find the check being executed.
When you are finished, this turns debugging off:
Code: Select all
sed -i 's/.*debug_level=.*/debug_level=0/g' /usr/local/nagios/etc/nagios.cfg
service nagios restart
Does this help?
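The debug file grows quickly at this debug level, so it can help to filter it before reading. Below is a minimal sketch, not an official Nagios tool: `show_macro_results` is a hypothetical helper name, and the default path assumes the same source-install layout used in the commands above. It prints only the macro-processing warnings and the final expanded strings.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Nagios): print macro warnings and the
# fully expanded strings from a Nagios debug log.
# The default path assumes a source install under /usr/local/nagios.
show_macro_results() {
    local debug_log=${1:-/usr/local/nagios/var/nagios.debug}
    # Keep only the lines that show a macro failure or an expansion result.
    grep -E "An error occurred processing macro|Done\. Final output:" "$debug_log"
}
```

Calling `show_macro_results` with no argument reads the default file; pass another path as the first argument if your installation differs.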
Re: APC UPS MONITORIN
Posted: Wed Jun 15, 2016 7:21 am
by orani
The nagios.debug file contains the following. The host I am trying to monitor is 10.0.5.112.
Code: Select all
[1465991772.300099] [008.0] [pid=27879] ** Service Check Event ==> Host: '10.0.5.112', Service: 'Battery Remaining', Options: 0, Latency: 0.000000 sec
[1465991772.300114] [001.0] [pid=27879] run_scheduled_service_check() start
[1465991772.300126] [016.0] [pid=27879] Attempting to run scheduled check of service 'Battery Remaining' on host '10.0.5.112': check options=0, latency=0.000000
[1465991772.300140] [001.0] [pid=27879] run_async_service_check()
[1465991772.300153] [001.0] [pid=27879] check_service_check_viability()
[1465991772.300166] [001.0] [pid=27879] check_time_against_period()
[1465991772.300183] [001.0] [pid=27879] _get_matching_timerange()
[1465991772.300201] [001.0] [pid=27879] check_service_dependencies()
[1465991772.300217] [064.1] [pid=27879] Making callbacks (type 6)...
[1465991772.300230] [016.0] [pid=27879] Checking service 'Battery Remaining' on host '10.0.5.112'...
[1465991772.300247] [001.0] [pid=27879] get_raw_command_line_r()
[1465991772.300261] [001.0] [pid=27879] process_macros_r()
[1465991772.300274] [2048.1] [pid=27879] **** BEGIN MACRO PROCESSING ***********
[1465991772.300286] [2048.1] [pid=27879] Processing: 'public'
[1465991772.300300] [2048.1] [pid=27879] Done. Final output: 'public'
[1465991772.300312] [2048.1] [pid=27879] **** END MACRO PROCESSING *************
[1465991772.300324] [001.0] [pid=27879] process_macros_r()
[1465991772.300337] [2048.1] [pid=27879] **** BEGIN MACRO PROCESSING ***********
[1465991772.300349] [2048.1] [pid=27879] Processing: 'battery_remaining'
[1465991772.300362] [2048.1] [pid=27879] Done. Final output: 'battery_remaining'
[1465991772.300375] [2048.1] [pid=27879] **** END MACRO PROCESSING *************
[1465991772.300387] [001.0] [pid=27879] process_macros_r()
[1465991772.300399] [2048.1] [pid=27879] **** BEGIN MACRO PROCESSING ***********
[1465991772.300411] [2048.1] [pid=27879] Processing: '-U h'
[1465991772.300424] [2048.1] [pid=27879] Done. Final output: '-U h'
[1465991772.300437] [2048.1] [pid=27879] **** END MACRO PROCESSING *************
[1465991772.300462] [001.0] [pid=27879] process_macros_r()
[1465991772.300474] [2048.1] [pid=27879] **** BEGIN MACRO PROCESSING ***********
[1465991772.300486] [2048.1] [pid=27879] Processing: '-w 1'
[1465991772.300499] [2048.1] [pid=27879] Done. Final output: '-w 1'
[1465991772.300512] [2048.1] [pid=27879] **** END MACRO PROCESSING *************
[1465991772.300524] [001.0] [pid=27879] process_macros_r()
[1465991772.300536] [2048.1] [pid=27879] **** BEGIN MACRO PROCESSING ***********
[1465991772.300548] [2048.1] [pid=27879] Processing: '-c 2'
[1465991772.300561] [2048.1] [pid=27879] Done. Final output: '-c 2'
[1465991772.300574] [2048.1] [pid=27879] **** END MACRO PROCESSING *************
[1465991772.300586] [001.0] [pid=27879] process_macros_r()
[1465991772.300598] [2048.1] [pid=27879] **** BEGIN MACRO PROCESSING ***********
[1465991772.300610] [2048.1] [pid=27879] Processing: ''
[1465991772.300623] [2048.1] [pid=27879] Done. Final output: ''
[1465991772.300636] [2048.1] [pid=27879] **** END MACRO PROCESSING *************
[1465991772.300648] [001.0] [pid=27879] process_macros_r()
[1465991772.300660] [2048.1] [pid=27879] **** BEGIN MACRO PROCESSING ***********
[1465991772.300672] [2048.1] [pid=27879] Processing: '$USER1$/check_ups_powerware.pl -H $HOSTADDRESS$ -C $ARG$'
[1465991772.300690] [2048.0] [pid=27879] WARNING: An error occurred processing macro 'ARG'!
[1465991772.300704] [2048.1] [pid=27879] Done. Final output: '/usr/local/nagios/libexec/check_ups_powerware.pl -H 10.0.5.112 -C $ARG$'
[1465991772.300717] [2048.1] [pid=27879] **** END MACRO PROCESSING *************
[1465991772.300750] [064.1] [pid=27879] Making callbacks (type 6)...
[1465991772.300771] [001.0] [pid=27879] macros_to_kvv()
[1465991772.300791] [001.0] [pid=27879] clear_volatile_macros_r()
[1465991772.300805] [001.0] [pid=27879] handle_timed_event() end
[1465991772.300819] [064.1] [pid=27879] Making callbacks (type 1)...
[1465991772.300834] [008.1] [pid=27879] ** Event Check Loop
[1465991772.300853] [008.1] [pid=27879] Next Event Time: Wed Jun 15 14:56:14 2016
[1465991772.300866] [008.1] [pid=27879] Current/Max Service Checks: 1/0 (inf% saturation)
[1465991772.300881] [12288.1] [pid=27879] ## Polling 1500ms; sockets=6; events=162; iobs=0x8d810a8
[1465991772.503632] [001.0] [pid=27879] handle_async_service_check_result()
[1465991772.503679] [016.0] [pid=27879] ** Handling check result for service 'Battery Remaining' on host '10.0.5.112' from 'Core Worker 27881'...
[1465991772.503695] [016.1] [pid=27879] HOST: 10.0.5.112, SERVICE: Battery Remaining, CHECK TYPE: Active, OPTIONS: 0, SCHEDULED: Yes, RESCHEDULE: Yes, EXITED OK: Yes, RETURN CODE: 3, OUTPUT:
Check type has not been defined
Usage: /usr/local/nagios/libexec/check_ups_powerware.pl [-v] -H <host> -C <snmp_community> [-2] | (-l login -x password) [-P <port>] [-T <check_type> alarm_status|battery_monitoring_status|battery_remaining|battery_test_status|firmware_version|firmware_version_nic|global_status|model|output_voltage|output_load|output_current|output_freq|input_voltage|input_freq|input_current|serial_number] [-U <unit> h|m|s] [-w <warning value>] [-c <critical value>] -r [-t <timeout>] [-V] -O <Number of Phases>
For detailed help type "check_ups_powerware.pl --help | more"
[1465991772.503747] [001.0] [pid=27879] get_service_check_return_code()
[1465991772.503767] [016.1] [pid=27879] Service is in a non-OK state!
[1465991772.503779] [016.1] [pid=27879] Host is currently UP, so we'll recheck its state to make sure...
[1465991772.503792] [001.0] [pid=27879] schedule_host_check()
[1465991772.503954] [016.0] [pid=27879] Scheduling a non-forced, active check of host '10.0.5.112' @ Wed Jun 15 14:56:12 2016
[1465991772.503979] [064.1] [pid=27879] Making callbacks (type 1)...
[1465991772.503995] [001.0] [pid=27879] add_event()
[1465991772.504013] [064.1] [pid=27879] Making callbacks (type 12)...
[1465991772.504026] [016.1] [pid=27879] Current/Max Attempt(s): 2/1000
[1465991772.504039] [016.1] [pid=27879] Host is UP, so we'll retry the service check...
[1465991772.504149] [064.1] [pid=27879] Making callbacks (type 2)...
[1465991772.504163] [001.0] [pid=27879] handle_service_event()
[1465991772.504177] [064.1] [pid=27879] Making callbacks (type 23)...
[1465991772.504191] [001.0] [pid=27879] run_global_service_event_handler()
[1465991772.504204] [001.0] [pid=27879] clear_volatile_macros_r()
[1465991772.504225] [016.1] [pid=27879] Rescheduling next check of service at Wed Jun 15 14:57:12 2016
[1465991772.504239] [001.0] [pid=27879] get_next_valid_time()
[1465991772.504258] [001.0] [pid=27879] _get_matching_timerange()
[1465991772.504277] [001.0] [pid=27879] schedule_service_check()
[1465991772.504297] [016.0] [pid=27879] Scheduling a non-forced, active check of service 'Battery Remaining' on host '10.0.5.112' @ Wed Jun 15 14:57:12 2016
[1465991772.504311] [001.0] [pid=27879] add_event()
[1465991772.504327] [064.1] [pid=27879] Making callbacks (type 13)...
[1465991772.504341] [064.1] [pid=27879] Making callbacks (type 6)...
[1465991772.504355] [064.1] [pid=27879] Making callbacks (type 13)...
[1465991772.504367] [001.0] [pid=27879] check_for_service_flapping()
[1465991772.504380] [016.1] [pid=27879] Checking service 'Battery Remaining' on host '10.0.5.112' for flapping...
[1465991772.504393] [001.0] [pid=27879] check_for_host_flapping()
[1465991772.504405] [016.1] [pid=27879] Checking host '10.0.5.112' for flapping...
[1465991772.504419] [016.1] [pid=27879] Host is not flapping (0.00% state change).
[1465991772.504437] [008.1] [pid=27879] ** Event Check Loop
[1465991772.504456] [008.1] [pid=27879] Next Event Time: Wed Jun 15 14:56:12 2016
[1465991772.504469] [008.1] [pid=27879] Current/Max Service Checks: 0/0 (-nan% saturation)
[1465991772.504485] [12288.1] [pid=27879] ## Polling 0ms; sockets=6; events=163; iobs=0x8d810a8
[1465991772.504502] [001.0] [pid=27879] handle_timed_event() start
[1465991772.504515] [064.1] [pid=27879] Making callbacks (type 1)...
[1465991772.504534] [008.0] [pid=27879] ** Timed Event ** Type: EVENT_HOST_CHECK, Run Time: Wed Jun 15 14:56:12 2016
[1465991772.504549] [008.0] [pid=27879] ** Host Check Event ==> Host: '10.0.5.112', Options: 8, Latency: 0.000540 sec
[1465991772.504565] [001.0] [pid=27879] run_scheduled_host_check()
[1465991772.504577] [016.0] [pid=27879] Attempting to run scheduled check of host '10.0.5.112': check options=8, latency=0.000540
[1465991772.504591] [001.0] [pid=27879] run_async_host_check(10.0.5.112 ...)
[1465991772.504604] [016.0] [pid=27879] ** Running async check of host '10.0.5.112'...
[1465991772.504617] [016.0] [pid=27879] Host '10.0.5.112' passed first hurdle (caching/execution)
[1465991772.504630] [001.0] [pid=27879] check_host_check_viability()
[1465991772.504643] [001.0] [pid=27879] check_time_against_period()
[1465991772.504661] [001.0] [pid=27879] _get_matching_timerange()
[1465991772.504679] [001.0] [pid=27879] check_host_dependencies()
[1465991772.504694] [064.1] [pid=27879] Making callbacks (type 7)...
[1465991772.504707] [016.0] [pid=27879] Checking host '10.0.5.112'...
[1465991772.504720] [001.0] [pid=27879] adjust_host_check_attempt()
[1465991772.504756] [001.0] [pid=27879] get_raw_command_line_r()
[1465991772.504771] [001.0] [pid=27879] process_macros_r()
[1465991772.504784] [2048.1] [pid=27879] **** BEGIN MACRO PROCESSING ***********
[1465991772.504796] [2048.1] [pid=27879] Processing: '$USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 5'
[1465991772.504814] [2048.1] [pid=27879] Done. Final output: '/usr/local/nagios/libexec/check_ping -H 10.0.5.112 -w 3000.0,80% -c 5000.0,100% -p 5'
[1465991772.504828] [2048.1] [pid=27879] **** END MACRO PROCESSING *************
[1465991772.504843] [064.1] [pid=27879] Making callbacks (type 7)...
[1465991772.504862] [001.0] [pid=27879] macros_to_kvv()
[1465991772.505946] [001.0] [pid=27879] clear_volatile_macros_r()
[1465991772.505971] [001.0] [pid=27879] handle_timed_event() end
[1465991772.505986] [064.1] [pid=27879] Making callbacks (type 1)...
[1465991772.506002] [008.1] [pid=27879] ** Event Check Loop
[1465991772.506042] [008.1] [pid=27879] Next Event Time: Wed Jun 15 14:56:14 2016
[1465991772.506055] [008.1] [pid=27879] Current/Max Service Checks: 0/0 (-nan% saturation)
[1465991772.506070] [12288.1] [pid=27879] ## Polling 1500ms; sockets=6; events=162; iobs=0x8d810a8
[1465991774.007633] [12288.1] [pid=27879] ## Polling 227ms; sockets=6; events=162; iobs=0x8d810a8
[1465991774.234975] [001.0] [pid=27879] handle_timed_event() start
[1465991774.235035] [064.1] [pid=27879] Making callbacks (type 1)...
Re: APC UPS MONITORIN
Posted: Wed Jun 15, 2016 4:20 pm
by tgriep
It looks like the command and the check definition are not correct.
Can you edit the command and change it to the following?
Code: Select all
command_line $USER1$/check_ups_powerware.pl -H $HOSTADDRESS$ -C $ARG1$ -T $ARG2$ $ARG3$ $ARG4$ $ARG5$ $ARG6$ $ARG7$ $ARG8$
Then edit the service check and change that command to this:
Code: Select all
check_command check_ups_powerware!public!battery_remaining!-U m!-w 15!-c 8
Save it and restart nagios.
Then the error should be gone and the check should function for you.
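To see why the two definitions have to line up, here is a rough sketch of the mapping: the fields after the command name in check_command, separated by `!`, become $ARG1$, $ARG2$, and so on in command_line. `expand_check_command` is a hypothetical helper written for illustration, not Nagios code; it only handles $ARG1$ through $ARG8$ (mirroring the command above) and leaves other macros such as $USER1$ and $HOSTADDRESS$ untouched.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Nagios): mimics how the '!'-separated
# fields of check_command are substituted for $ARG1$..$ARG8$.
expand_check_command() {
    local check_command=$1 command_line=$2
    local IFS='!'          # split on '!' only, so "-U m" stays one field
    local -a parts
    read -r -a parts <<< "$check_command"
    local i
    # parts[0] is the command name; parts[1] maps to $ARG1$, and so on.
    for ((i = 1; i <= 8; i++)); do
        command_line=${command_line//"\$ARG${i}\$"/${parts[i]:-}}
    done
    printf '%s\n' "$command_line"
}
```

For example, `expand_check_command 'check_ups_powerware!public!battery_remaining!-U m' 'check_ups_powerware.pl -C $ARG1$ -T $ARG2$ $ARG3$'` substitutes `public`, `battery_remaining` and `-U m` in that order. Unused $ARGn$ macros expand to empty strings in this sketch.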
Re: APC UPS MONITORIN
Posted: Wed Jun 15, 2016 4:22 pm
by tmcdonald
Look at your command definition:
Code: Select all
define command{
command_name check_ups_powerware
command_line $USER1$/check_ups_powerware.pl -H $HOSTADDRESS$ -C $ARG$
}
$ARG$ needs to be $ARG1$ - having no number is not valid. It also appears that you are passing more args beyond that, but are not using them in the command_line definition. You'll need to fix the command_line for this to work. In particular, you are missing the -U and -T args, so it should be:
Code: Select all
define command{
command_name check_ups_powerware
command_line $USER1$/check_ups_powerware.pl -H $HOSTADDRESS$ -C $ARG1$ -T $ARG2$ $ARG3$
}
I am not sure what the extra $ means at the end of the service's check_command but it is likely in error.
Edit: Damn, @tgriep beat me to it :)
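A quick way to check for other definitions with the same mistake is to search the config files for the unnumbered macro. This is only a sketch: `find_unnumbered_arg` is a hypothetical helper name, and the files to pass depend on where your object definitions live.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Nagios): list lines that use the
# unnumbered $ARG$ macro, which Nagios cannot expand.
find_unnumbered_arg() {
    # -F treats '$ARG$' as a literal string; -n prints line numbers.
    # Numbered macros such as $ARG1$ do not match.
    grep -nF '$ARG$' "$@"
}
```

For example, `find_unnumbered_arg /usr/local/nagios/etc/objects/*.cfg` would cover the default source-install object directory, assuming that layout. The function returns a non-zero status when nothing matches.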
Re: APC UPS MONITORIN
Posted: Wed Jun 15, 2016 5:24 pm
by orani
I am not sure what the extra $ means at the end of the service's check_command but it is likely in error.
This is a "copy error". The symbol means that the line does not end there when you view it in the terminal.
I will try changing the command definition as you suggested in a few hours, when I am at the office, and I will let you know the result.
I hope this solves the problem.
Re: APC UPS MONITORIN
Posted: Thu Jun 16, 2016 9:23 am
by mcapra
Let us know of any additional developments!
Re: APC UPS MONITORIN
Posted: Thu Jun 16, 2016 12:10 pm
by orani
It worked! Not all checks, but most of them are OK!