Wrong current_state with check_tcp command
Posted: Mon Jan 15, 2018 10:36 am
by naitsab
Hello, I have a command using check_tcp. It returns exit code 2, but the status.dat file shows current_state=1. Does anyone know where this could come from? I expected it to be 2.
Nagios environment:
Code:
#Centos6.8
nagios-3.5.1-1.el6.x86_64
nagios-plugins-1.4.16-10.el6.x86_64
nagios-plugins-tcp-1.4.16-10.el6.x86_64
The command:
Code:
define command{
    command_name    check_tcp
    command_line    $USER1$/check_tcp -H $HOSTADDRESS$ -p $ARG1$
}
From the terminal, command output:
Code:
-bash-4.1$ /usr/lib64/nagios/plugins/check_tcp -H some_mysql_db_host.com -p 3306
CRITICAL - Socket timeout after 10 seconds
-bash-4.1$ echo $?
2
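The same check can be reproduced from a script, which makes it easier to compare the plugin's exit code with what ends up in status.dat. This is only a minimal sketch: `run_plugin` and `PLUGIN_STATES` are names made up for illustration, and the mapping is the standard plugin return-code convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN), not anything specific to check_tcp.

```python
import subprocess
import sys

# Standard Nagios plugin return codes (plugin convention, not host states).
PLUGIN_STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def run_plugin(argv, timeout=30):
    """Run a plugin command line and return (exit_code, state_name, first_line).

    `argv` is the full command as a list, e.g.
    ["/usr/lib64/nagios/plugins/check_tcp", "-H", "some_host", "-p", "3306"].
    """
    if not argv:
        raise ValueError("argv must contain at least the plugin path")
    proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    lines = proc.stdout.splitlines()
    first_line = lines[0] if lines else ""
    # Any return code outside 0-3 is reported as UNKNOWN in this sketch.
    state = PLUGIN_STATES.get(proc.returncode, "UNKNOWN")
    return proc.returncode, state, first_line
```

Calling `run_plugin` with the command line shown above should give the same exit code as `echo $?` in the shell, assuming the script runs as the same user and from the same machine as Nagios.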
The state in status.dat:
Code:
$ grep $somefilter /var/log/nagios/status.dat
hoststatus {
host_name=hostname
[...]
current_state=1
[...]
}
Re: Wrong current_state with check_tcp command
Posted: Mon Jan 15, 2018 4:01 pm
by cdienger
What is status_update_interval set to in /usr/local/nagios/etc/nagios.cfg? Are there any errors or warnings when you run /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg, or anything odd logged in /var/log/messages? Make sure Nagios is running with ps aux | grep nagios.cfg and service nagios status.
Please provide the output of the commands as well as a copy of nagios.cfg if the above doesn't help resolve things.
Re: Wrong current_state with check_tcp command
Posted: Tue Jan 16, 2018 6:41 am
by naitsab
Hello
My status_update_interval is set to 10. There is just one warning (across more than 1000 hosts) about a host that does not have any services associated with it, but that is normal in our case. There is nothing unusual in /var/log/messages either, and the service is running properly (monitoring thousands of hosts without problems); only the host with check_tcp has this mismatch with current_state.
Re: Wrong current_state with check_tcp command
Posted: Tue Jan 16, 2018 7:21 am
by naitsab
Here is the config:
Code:
log_file=/var/log/nagios/nagios.log
cfg_file=/etc/nagios/objects/commands.cfg
cfg_file=/etc/nagios/objects/contacts.cfg
cfg_file=/etc/nagios/objects/timeperiods.cfg
cfg_file=/etc/nagios/objects/templates.cfg
cfg_dir=/etc/nagios/conf.d
object_cache_file=/var/log/nagios/objects.cache
precached_object_file=/var/log/nagios/objects.precache
resource_file=/etc/nagios/private/resource.cfg
status_file=/var/log/nagios/status.dat
status_update_interval=10
nagios_user=nagios
nagios_group=nagios
check_external_commands=1
command_check_interval=-1
command_file=/var/spool/nagios/cmd/nagios.cmd
external_command_buffer_slots=4096
lock_file=/var/run/nagios.pid
temp_file=/var/log/nagios/nagios.tmp
temp_path=/tmp
event_broker_options=-1
log_rotation_method=d
log_archive_path=/var/log/nagios/archives
use_syslog=1
log_notifications=1
log_service_retries=1
log_host_retries=1
log_event_handlers=1
log_initial_states=0
log_external_commands=1
log_passive_checks=1
service_inter_check_delay_method=s
max_service_check_spread=60
service_interleave_factor=s
host_inter_check_delay_method=s
max_host_check_spread=60
max_concurrent_checks=0
check_result_reaper_frequency=10
max_check_result_reaper_time=300
check_result_path=/var/log/nagios/spool/checkresults
max_check_result_file_age=3600
cached_host_check_horizon=15
cached_service_check_horizon=15
enable_predictive_host_dependency_checks=1
enable_predictive_service_dependency_checks=1
soft_state_dependencies=0
auto_reschedule_checks=0
auto_rescheduling_interval=30
auto_rescheduling_window=180
sleep_time=0.25
service_check_timeout=600
host_check_timeout=300
event_handler_timeout=300
notification_timeout=300
ocsp_timeout=5
perfdata_timeout=5
retain_state_information=1
state_retention_file=/var/log/nagios/retention.dat
retention_update_interval=60
use_retained_program_state=1
use_retained_scheduling_info=1
retained_host_attribute_mask=0
retained_service_attribute_mask=0
retained_process_host_attribute_mask=0
retained_process_service_attribute_mask=0
retained_contact_host_attribute_mask=0
retained_contact_service_attribute_mask=0
interval_length=60
check_for_updates=1
bare_update_check=0
use_aggressive_host_checking=0
execute_service_checks=1
accept_passive_service_checks=1
execute_host_checks=1
accept_passive_host_checks=1
enable_notifications=1
enable_event_handlers=1
process_performance_data=0
obsess_over_services=0
obsess_over_hosts=0
translate_passive_host_checks=0
passive_host_checks_are_soft=0
check_for_orphaned_services=1
check_for_orphaned_hosts=1
check_service_freshness=1
service_freshness_check_interval=60
check_host_freshness=0
host_freshness_check_interval=60
additional_freshness_latency=15
enable_flap_detection=1
low_service_flap_threshold=5.0
high_service_flap_threshold=20.0
low_host_flap_threshold=5.0
high_host_flap_threshold=20.0
date_format=us
p1_file=/usr/sbin/p1.pl
enable_embedded_perl=1
use_embedded_perl_implicitly=1
illegal_object_name_chars=`~!$%^&*|'"<>?,()=
illegal_macro_output_chars=`~$&|'"<>
use_regexp_matching=1
use_true_regexp_matching=0
admin_email=HIDDEN
admin_pager=pagenagios@localhost
daemon_dumps_core=0
use_large_installation_tweaks=0
enable_environment_macros=1
debug_level=0
debug_verbosity=1
debug_file=/var/log/nagios/nagios.debug
max_debug_file_size=1000000
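For anyone who wants to pull individual directives (such as status_update_interval or use_aggressive_host_checking) out of a main config file like the one above, here is a rough sketch. `parse_nagios_cfg` is a hypothetical helper, not part of Nagios. It assumes plain key=value lines, keeps repeated directives such as cfg_file as a list, and splits only on the first "=" so values that contain "=" survive intact.

```python
from collections import defaultdict


def parse_nagios_cfg(text):
    """Parse key=value lines of a main config file into {key: [values]}."""
    settings = defaultdict(list)
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()].append(value.strip())
    return dict(settings)
```

With the file contents read into a string, `parse_nagios_cfg(text)["status_update_interval"]` would give `["10"]` for the config posted here.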
Re: Wrong current_state with check_tcp command
Posted: Tue Jan 16, 2018 3:37 pm
by cdienger
Can we get the complete portion of status.dat showing this host's status?
Re: Wrong current_state with check_tcp command
Posted: Wed Jan 17, 2018 5:00 am
by naitsab
Here it is:
Code:
hoststatus {
host_name=HIDDEN
modified_attributes=0
check_command=check_tcp!3306
check_period=
notification_period=24x7
check_interval=5.000000
retry_interval=1.000000
event_handler=
has_been_checked=1
should_be_scheduled=1
check_execution_time=10.010
check_latency=0.187
check_type=0
current_state=1
last_hard_state=1
last_event_id=3126772
current_event_id=3126773
current_problem_id=1649640
last_problem_id=0
plugin_output=CRITICAL - Socket timeout after 10 seconds
long_plugin_output=
performance_data=
last_check=1516183109
next_check=1516183424
check_options=0
current_attempt=1
max_attempts=4
state_type=1
last_state_change=1516015896
last_hard_state_change=1516015896
last_time_up=1516015308
last_time_down=1516183124
last_time_unreachable=0
last_notification=1516182509
next_notification=1516184309
no_more_notifications=0
current_notification_number=91
current_notification_id=5277244
notifications_enabled=1
problem_has_been_acknowledged=0
acknowledgement_type=0
active_checks_enabled=1
passive_checks_enabled=1
event_handler_enabled=1
flap_detection_enabled=1
failure_prediction_enabled=1
process_performance_data=1
obsess_over_host=1
last_update=1516183124
is_flapping=0
percent_state_change=0.00
scheduled_downtime_depth=0
}
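A block like the one above can also be read programmatically instead of with grep. The sketch below is an assumption-laden helper, not an official tool: `parse_status_blocks` and `summarize_host` are invented names, the parser assumes the "hoststatus {" ... "}" layout shown here with one key=value pair per line, and the state names follow the host-state numbering (0 UP, 1 DOWN, 2 UNREACHABLE) with state_type 0 as SOFT and 1 as HARD.

```python
# Host-state numbering, which differs from plugin return codes.
HOST_STATES = {0: "UP", 1: "DOWN", 2: "UNREACHABLE"}
STATE_TYPES = {0: "SOFT", 1: "HARD"}


def parse_status_blocks(text, block_type="hoststatus"):
    """Return a list of dicts, one per `block_type { ... }` block in status.dat text."""
    blocks = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if current is None:
            # Look for the start of a block of the requested type.
            if line == block_type + " {":
                current = {}
        elif line == "}":
            blocks.append(current)
            current = None
        elif "=" in line:
            # Split on the first "=" only; values may contain "=" themselves.
            key, _, value = line.partition("=")
            current[key] = value
    return blocks


def summarize_host(block):
    """Translate the numeric fields of one hoststatus block into readable names."""
    return {
        "host_name": block["host_name"],
        "state": HOST_STATES.get(int(block["current_state"]), "?"),
        "state_type": STATE_TYPES.get(int(block["state_type"]), "?"),
        "attempt": block["current_attempt"] + "/" + block["max_attempts"],
        "plugin_output": block["plugin_output"],
    }
```

Applied to the block posted above, the summary would read as a HARD DOWN state at attempt 1/4 with the plugin output "CRITICAL - Socket timeout after 10 seconds".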
Re: Wrong current_state with check_tcp command
Posted: Wed Jan 17, 2018 5:41 pm
by npolovenko
@naitsab, would you be able to upload the source code for the check_tcp plugin? Also, does the state update in the web interface? Is it red (critical) or yellow? Please send us a screenshot from the web UI. Also, please force the check from the web interface and then look at the status.dat file one more time to see whether last_update has changed.
Re: Wrong current_state with check_tcp command
Posted: Thu Jan 18, 2018 3:10 am
by naitsab
Hey, I cannot give you the source code. It was installed from an RPM, version 1.4.16-10.el6.x86_64.
The UI shows red (critical). I am not sure what you mean by "force" in the UI; I guess it means rescheduling the check manually? If so, the state does not change.
Re: Wrong current_state with check_tcp command
Posted: Fri Jan 19, 2018 1:11 pm
by mcapra
A host object has three potential states: UP (0), DOWN (1), and UNREACHABLE (2). Per the plugin's output and the Nagios Core conventions, one would indeed think the host's status should be UNREACHABLE instead of DOWN in this case.
I'm fairly certain this is a bug that was fixed in 4.2.3:
https://github.com/NagiosEnterprises/na ... issues/289
Given that Core 3.5.1 is very old (released in 2013), I think it's unlikely that a similar patch will be deployed in a Core 3 release. For old RPM releases, though, that's out of the hands of anyone on this forum (depending on your OS and package provider).
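For reference, the host-state numbers listed above are not the same thing as plugin return codes, so an exit code of 2 does not automatically become current_state=2. Below is a simplified sketch of the translation as I understand it from the Core 3 host-check documentation: OK maps to UP, WARNING maps to UP unless use_aggressive_host_checking is enabled, and anything else maps to DOWN, which is then reported as UNREACHABLE only when the host has parents and none of them is UP. The function and constant names are invented for illustration, and the sketch does not model the issue linked above. Whether the parent rule applies to the host in this thread depends on its parents definition, which has not been posted.

```python
# Host states as stored in status.dat.
HOST_UP, HOST_DOWN, HOST_UNREACHABLE = 0, 1, 2
# Plugin return codes.
PLUGIN_OK, PLUGIN_WARNING, PLUGIN_CRITICAL, PLUGIN_UNKNOWN = 0, 1, 2, 3


def preliminary_host_state(return_code, aggressive=False):
    """Map a plugin return code to a preliminary host state (UP or DOWN)."""
    if return_code == PLUGIN_OK:
        return HOST_UP
    if return_code == PLUGIN_WARNING:
        # WARNING counts as DOWN only with aggressive host checking enabled.
        return HOST_DOWN if aggressive else HOST_UP
    # CRITICAL, UNKNOWN, and anything unexpected are treated as DOWN here.
    return HOST_DOWN


def final_host_state(return_code, parent_states=(), aggressive=False):
    """Apply the parent rule: DOWN becomes UNREACHABLE if no parent is UP."""
    state = preliminary_host_state(return_code, aggressive)
    if state == HOST_DOWN and parent_states and all(
        parent != HOST_UP for parent in parent_states
    ):
        return HOST_UNREACHABLE
    return state
```

Under these assumptions, a CRITICAL result on a host with no parents (or with at least one parent UP) comes out as DOWN, i.e. 1, and only a host whose parents are all DOWN or UNREACHABLE comes out as 2.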
Re: Wrong current_state with check_tcp command
Posted: Fri Jan 19, 2018 4:14 pm
by cdienger
Thanks for the input, @mcapra.
@naitsab, can you confirm the version?