These tablets report back to the centre using NSCA. We have two hosts receiving data: one sits within the company network (and therefore needs VPN access), whilst the other is within our DMZ.
One of the tablets was having issues which seemed to relate to load, and having read about Nagios 4.0 and its worker processes, an upgrade seemed worth a test.
The old problem was that, rather than running every 10 minutes, some of the services would go 20-30 minutes between checks. Following the upgrade, this issue appears to be resolved, which is good.
However, I've got an oddity: some of the reports are not being delivered by NSCA, but only those going to the VPN-connected host. The other host works without issue.
What I've found in /var/log/messages is the following:
Oct 23 17:02:57 TABLETNAME nagios: wproc: command: /usr/local/nagios/libexec/submit_check_result_to_vision1 HOSTNAME 'SHOWOSLEVEL' OK '7100-01-02-1150|version=7.1 release=01 ml=02'
Oct 23 17:02:57 TABLETNAME nagios: wproc: host=HOSTNAME; service=SHOWOSLEVEL; contact=(none)
Oct 23 17:02:57 TABLETNAME nagios: wproc: early_timeout=1; exited_ok=0; wait_status=0; error_code=62;
Oct 23 17:02:57 TABLETNAME nagios: Warning: OCSP command '/usr/local/nagios/libexec/submit_check_result_to_vision1 HOSTNAME 'SHOWOSLEVEL' OK '7100-01-02-1150|version=7.1 release=01 ml=02'' for service 'SHOWOSLEVEL' on host 'HOSTNAME' timed out after 0.00 seconds
The OCSP command is set in nagios.cfg as:
ocsp_command=submit_check_result_to_vision1
And how can it have "timed out after 0.00 seconds"? (I'm assuming this is related to error_code=62?)
This was never an issue in Nagios 3.5, but since the upgrade to 4.0 it affects 4 services on this one host, whilst the other 16 work like clockwork.
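To pin down exactly which services are affected, something like the following could tally the OCSP timeout warnings per host and service. It is only a rough sketch: the function name count_ocsp_timeouts is made up for illustration, and the pattern assumes the warning keeps the exact wording shown in the log excerpt above.

```python
import re
from collections import Counter

# Assumption: the warning text matches the Nagios 4.0 line quoted in the post.
# The greedy '.*' swallows the quoted command, including its inner quotes.
OCSP_TIMEOUT = re.compile(
    r"Warning: OCSP command '.*' for service '(?P<service>[^']+)' "
    r"on host '(?P<host>[^']+)' timed out after (?P<secs>[\d.]+) seconds"
)


def count_ocsp_timeouts(lines):
    """Count OCSP timeout warnings per (host, service) pair.

    `lines` is any iterable of log lines, e.g. an open file object.
    Lines that are not OCSP timeout warnings are ignored.
    """
    counts = Counter()
    for line in lines:
        match = OCSP_TIMEOUT.search(line)
        if match:
            counts[(match.group("host"), match.group("service"))] += 1
    return counts


def format_report(counts):
    """Render the counts as text lines, most frequent first."""
    return [
        f"{host} {service}: {n} timeout(s)"
        for (host, service), n in counts.most_common()
    ]
```

Feeding it the open log file (for example `count_ocsp_timeouts(open("/var/log/messages"))`) and printing `format_report(...)` should show whether the same 4 services fail every time or whether the set drifts.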
Any suggestions?
Thanks, Malcolm