Service check timed out after 60.01 seconds

Mahesh786
Posts: 30
Joined: Mon Apr 05, 2021 9:21 am

Re: Service check timed out after 60.01 seconds

Post by Mahesh786 »

Hi Perry,

I ran the command below and sent you the output privately. Please check it and let us know what the problem is.


while :; do find /usr/local/nagios/var/ -name "*.*" -not -path "/usr/local/nagios/var/rw/*" | xargs tail -F | grep -Ei "warn|error|fail|unknown|critical|ucprs4apprd05" >> /tmp/loggingit.txt; sleep 1; done

Regards,

Venkata Reddy
pbroste
Posts: 1288
Joined: Tue Jun 01, 2021 1:27 pm

Re: Service check timed out after 60.01 seconds

Post by pbroste »

Hello @Mahesh786

Thanks for sending over the results. We see some odd NDO-3 events, database events, and other event date/time stamps, and want to review the System Profile to see what is going on.

Please PM your updated system profile for us to review.

To send us your system profile:
  • Log in to the Nagios XI GUI using a web browser.
  • Click "Admin" > "System Profile".
  • Click the "Download Profile" button.
  • Save the profile.zip file and send it via Private Message.
Thanks,
Perry

Re: Service check timed out after 60.01 seconds

Post by Mahesh786 »

Hi Perry,

I have sent the profile.zip file privately. Please check it and let us know what the problem is.

Regards,
Venkata Reddy

Re: Service check timed out after 60.01 seconds

Post by pbroste »

Hello @Mahesh786

Thanks for following up with the System Profile.

We want to make sure that we are looking at the correct issue, so we want to split the problem and determine whether it is a client issue or a server issue.

To do that, we want to look at the data directly on the client, bypassing the server, by opening the NCPA web interface:

Code:

https://ucprs4apprd05:5693
or

Code:

https://10.1.44.55:5693
Check the 'Dashboard', then go to 'Live Data' and toggle through everything (CPU, Memory, Disks, Interfaces) to make sure that all data is coming through, including Top Processes.

At the same time, open the browser's Developer Tools > Network tab, reload the page, and check for any errors shown in red.

Let me know how things look from the client-side. If checks are getting hung up we may want to run NRPE checks from the server to see how they look from your 'ucprs4apprd05' client.
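
A rough sketch of how the same client-side check could be scripted with curl, so the browser is out of the picture as well. It assumes the NCPA API layout `/api/<endpoint>?token=<token>` on port 5693; `YOUR_TOKEN` is a placeholder for the community string configured on the client, and the function names `ncpa_url` and `probe_ncpa` are made up for this example.

```shell
# Build the NCPA API URL for a host (endpoint defaults to cpu/percent).
ncpa_url() {
  local host="$1" token="$2" endpoint="${3:-cpu/percent}"
  printf 'https://%s:5693/api/%s?token=%s\n' "$host" "$endpoint" "$token"
}

# Print the HTTP status code and total request time for one request.
# -k skips certificate verification (assumes a self-signed certificate),
# --max-time 60 mirrors the 60-second service check timeout.
probe_ncpa() {
  curl -k -s -o /dev/null --max-time 60 \
    -w '%{http_code} %{time_total}\n' "$(ncpa_url "$@")"
}
```

Running something like `while :; do date; probe_ncpa ucprs4apprd05 YOUR_TOKEN; sleep 10; done` from the Nagios XI server for a while should show whether the slow or failed responses line up with the times the service check times out.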

Thanks,
Perry

Re: Service check timed out after 60.01 seconds

Post by Mahesh786 »

Hi Perry,

As checked, when the connection is working we get the data, and when it is not working we cannot get any data. Some of the disks are also not returning data. A screenshot of this is attached.

After opening the developer tools we saw some errors; these are attached as well.

Please check and let us know what the issue is ASAP.

Regards,
Venkata Reddy

Re: Service check timed out after 60.01 seconds

Post by pbroste »

Hello @Mahesh786

Thanks for following up with the details and screenshots. What is odd about this issue is that, as you stated, the checks work when the connection is established, but you receive WebSocket time-out error messages during the periods when we are unable to connect to the host.

NCPA runs on Python, so please check which version you are running (my VM test environment has 3.6.8):

Code:

python -V
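
If the version needs to be captured in a script, note that Python 2 prints the `-V` output to stderr while Python 3.4 and later print it to stdout. A minimal sketch that handles both; the function names `py_major_from` and `py_major` are hypothetical:

```shell
# Extract the major version from a string such as "Python 2.7.5".
py_major_from() {
  printf '%s\n' "$1" | sed -n 's/^Python \([0-9][0-9]*\)\..*/\1/p'
}

# Ask an interpreter (default: python) for its version.
# 2>&1 is needed because Python 2 writes the -V output to stderr.
py_major() {
  py_major_from "$("${1:-python}" -V 2>&1)"
}
```

For example, `py_major python` prints just the major version number of whichever `python` is first on the PATH.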
Typically we see WebSocket issues when the SSL handshake fails, but that does not make sense in this case because the checks function and then intermittently stop working, unless you suspect a routing issue or something else within your corporate network.

Next, verify that your NCPA services are loaded, active, and not intermittently stopping:

Code:

systemctl status ncpa_listener.service

Code:

systemctl status ncpa_passive.service
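
A one-off `systemctl status` can miss a service that only stops now and then. As a rough sketch, the state of both units could be recorded with a timestamp at regular intervals; the function name `check_units` is made up for this example:

```shell
# Print one timestamped line per NCPA unit with its current state.
check_units() {
  local unit state
  for unit in ncpa_listener.service ncpa_passive.service; do
    # is-active exits non-zero for anything other than "active",
    # so "|| true" keeps the loop going in that case.
    state=$(systemctl is-active "$unit" 2>/dev/null) || true
    printf '%s %s %s\n' "$(date '+%F %T')" "$unit" "${state:-unknown}"
  done
}
```

Something like `while :; do check_units >> /tmp/ncpa_units.log; sleep 30; done` (the log path is only an example) leaves a record that can be compared with the times the checks fail.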
Please review the NCPA logs for errors or anything wonky:

Code:

/usr/local/ncpa/var/log/ncpa_listener.log
/usr/local/ncpa/var/log/ncpa_passive.log
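
A minimal sketch for scanning those logs from the shell. The search pattern is only a guess at what is worth looking for, and `scan_ncpa_log` is a hypothetical helper name:

```shell
# Print matching lines from one log file, with line numbers.
# -i: case-insensitive, -n: line numbers, -E: extended regex
scan_ncpa_log() {
  grep -inE 'error|warning|exception|traceback|timeout|ssl' "$1"
}
```

For example, `scan_ncpa_log /usr/local/ncpa/var/log/ncpa_listener.log | tail -n 50` shows the most recent matches in the listener log.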
And just to check again, is SELinux enabled?
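
One way to check, sketched under the assumption that the standard SELinux utilities may or may not be installed on the node (`selinux_mode` is a made-up helper name):

```shell
# Print the current SELinux mode (Enforcing, Permissive, or Disabled),
# or a note if the getenforce utility is not available.
selinux_mode() {
  if command -v getenforce >/dev/null 2>&1; then
    getenforce
  else
    echo "getenforce not found (SELinux tools are probably not installed)"
  fi
}
```

Calling `selinux_mode` on the client prints a single line that can be pasted into the reply.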

Please let me know the results,
Perry

Re: Service check timed out after 60.01 seconds

Post by Mahesh786 »

Hi Perry,

I have executed all the commands and attached the output. SELinux is not enabled on this node, and the firewall is inactive.

Regards,
Venkata Reddy

Re: Service check timed out after 60.01 seconds

Post by pbroste »

Hello @Mahesh786

Thanks for following up. We see that you are on an older Python version, and the Python cipher modules have been updated since 2.x. Please upgrade to a 3.x version and let us know how things go.

Thanks,
Perry