All hosts are down - icmp check failing

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
Locked
Shwele
Posts: 47
Joined: Tue Oct 03, 2017 3:00 am

All hosts are down - icmp check failing

Post by Shwele »

The host itself shows as down with this output:

Code: Select all

Warning: This plugin must be either run as root or setuid root.
All the services are up and running with OK status.

Permissions for that check are:

Code: Select all

-rwxrwxr-x 1 nagios nagios 213856 Oct 24 16:02 check_icmp*
Anything I'm missing here?
I'm trying to check an Azure server and also tried another server that we have; same error.

Thanks
npolovenko
Support Tech
Posts: 3457
Joined: Mon May 15, 2017 5:00 pm

Re: All hosts are down - icmp check failing

Post by npolovenko »

Hello again, @Shwele.
Please go to

Code: Select all

cd /usr/local/nagios/libexec/
and run

Code: Select all

chmod u+s check_icmp
After that, restart Nagios:

Code: Select all

service nagios restart
The plugin should be working now.
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
Shwele
Posts: 47
Joined: Tue Oct 03, 2017 3:00 am

Re: All hosts are down - icmp check failing

Post by Shwele »

Why hello hello, my troubleshooting buddy @npolovenko

It's not; that is the problem.

Forgot to mention that when I view the host, it even says the following, which I already did:

Code: Select all

To run as root, you can use a tool like sudo.
To set the setuid permissions, use the command:
chmod u+s yourpluginfile
check_icmp: Failed to obtain ICMP socket: Operation not permitted
As you can see, the permissions are set:
nagios.png
After forcing an immediate check, it's still down.
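One thing worth checking at this point: the setuid bit makes a program run with the privileges of the file's owner, and the listing earlier in the thread shows check_icmp owned by nagios, so chmod u+s alone would make it setuid nagios rather than setuid root. A minimal sketch for checking this, assuming a POSIX shell with ls and awk; the function name setuid_status is made up for illustration and is not part of Nagios:

```shell
#!/bin/sh
# Hypothetical helper (not part of Nagios): report whether a file is
# setuid and, if so, whether its owner is root (uid 0).
setuid_status() {
    file=$1
    if [ ! -e "$file" ]; then
        echo "missing"
        return 3
    fi
    # -u is true when the setuid bit is set on the file.
    if [ ! -u "$file" ]; then
        echo "no-setuid"
        return 1
    fi
    # The third field of `ls -ldn` is the numeric uid of the owner.
    owner_uid=$(ls -ldn "$file" | awk '{print $3}')
    if [ "$owner_uid" = "0" ]; then
        echo "setuid-root"
        return 0
    fi
    echo "setuid-nonroot"
    return 2
}

# Example (path taken from this thread):
#   setuid_status /usr/local/nagios/libexec/check_icmp
```

If this reports setuid-nonroot, a likely fix is to run chown root check_icmp as root first and then chmod u+s check_icmp again, in that order, because changing the owner can clear the setuid bit.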

Is there a public server I could add as a test host? Maybe it's my provider's fault for not allowing ICMP through their routers.
npolovenko
Support Tech
Posts: 3457
Joined: Mon May 15, 2017 5:00 pm

Re: All hosts are down - icmp check failing

Post by npolovenko »

@Shwele,
Shwele wrote: "Why hello hello, my troubleshooting buddy @npolovenko"
That's right, next time I should probably say Chao, prijatelju! ;)


Are you able to just run a ping command from your nagios server?

Code: Select all

ping server_IP
It could be that the servers you're trying to check have some firewall restrictions.

Also, if you go to /usr/local/nagios/libexec/ there should be another plugin called check_ping.
Let's see if it works:

Code: Select all

cd /usr/local/nagios/libexec/
./check_ping -H Target_Server_IP -w 1,10% -c 2,20%
If it works, we could modify the template in XI to swap check_icmp for check_ping. Both plugins perform essentially the same function.
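For reference, check_ping's -w and -c options each take a pair of the form <rta>,<pl>%, where rta is the round-trip average in milliseconds and pl is the packet-loss percentage. So -w 1,10% in the command above means "warn at 1 ms average or 10% loss". A small sketch that splits such a threshold, assuming a POSIX shell; split_threshold is a name invented here, not a Nagios tool:

```shell
#!/bin/sh
# Hypothetical helper: split a check_ping threshold such as "1,10%"
# into its round-trip-average (ms) and packet-loss (%) parts.
split_threshold() {
    case $1 in
        *,*%) ;;
        *) echo "invalid threshold: $1" >&2; return 1 ;;
    esac
    rta=${1%%,*}      # text before the first comma
    pl=${1#*,}        # text after the first comma
    pl=${pl%\%}       # drop the trailing percent sign
    echo "rta_ms=$rta pl_percent=$pl"
}

split_threshold "1,10%"     # warning threshold from the command above
split_threshold "2,20%"     # critical threshold from the command above
```

With thresholds as tight as 1 ms and 2 ms, a reachable but distant host can still come back WARNING or CRITICAL on round-trip time alone.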
Shwele
Posts: 47
Joined: Tue Oct 03, 2017 3:00 am

Re: All hosts are down - icmp check failing

Post by Shwele »

Ćao prijatelju! @npolovenko

Global admins sure have it easy, learning where I'm from to get extra friendly. :oops:

I am able to ping the locally hosted server, but I am unable to ping the Azure server. From what I've researched, they have ICMP (IP protocol 1, which is not a port) disabled, making it non-pingable.

As for swapping check_icmp for check_ping, I'll check whether that fixes the issue for that server at least.

Azure server output from the check_ping command you proposed:

Code: Select all

PING CRITICAL - Packet loss = 100%|rta=2.000000ms;1.000000;2.000000;0.000000 pl=100%;10;20;0
Server hosted in my country:

Code: Select all

PING OK - Packet loss = 0%, RTA = 0.44 ms|rta=0.444000ms;1.000000;2.000000;0.000000 pl=0%;10;20;0
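Both results above follow the usual plugin output layout: human-readable status text, then a | character, then performance data as space-separated label=value;warn;crit;min entries. A minimal sketch that pulls those apart, assuming a POSIX shell; the function names are invented for this example:

```shell
#!/bin/sh
# Hypothetical helpers for picking apart a plugin output line of the form
#   STATUS TEXT|label=value;warn;crit;min ...

# Print the status text (everything before the first "|").
status_text() {
    printf '%s\n' "${1%%|*}"
}

# Print the value of one perfdata label, e.g. "pl" or "rta".
perf_value() {
    line=$1
    label=$2
    perf=${line#*|}                    # everything after the first "|"
    for item in $perf; do              # entries are space-separated
        case $item in
            "$label"=*)
                value=${item#*=}               # strip "label="
                printf '%s\n' "${value%%;*}"   # keep the part before ";warn"
                return 0
                ;;
        esac
    done
    return 1                           # label not found
}

out='PING OK - Packet loss = 0%, RTA = 0.44 ms|rta=0.444000ms;1.000000;2.000000;0.000000 pl=0%;10;20;0'
status_text "$out"        # PING OK - Packet loss = 0%, RTA = 0.44 ms
perf_value "$out" rta     # 0.444000ms
perf_value "$out" pl      # 0%
```

Read this way, the Azure result shows pl=100% against a critical threshold of 20, which is why it is CRITICAL regardless of the rta figure.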
Update:
After moving check_icmp aside and copying check_ping under its name, the host still appears down in the Nagios interface, even though the check works from the command line.

Ping command in Nagios XI:

Code: Select all

$USER1$/check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p 5
ARG1: 100.0,20%
ARG2: 500.0,60%
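To see what Nagios ends up running, it can help to expand the macros by hand. A rough sketch, assuming $USER1$ points at /usr/local/nagios/libexec (the plugin directory used earlier in this thread); the function name is made up, 192.0.2.10 is only a placeholder address, and the sketch handles just the four macros that appear in the command above:

```shell
#!/bin/sh
# Hypothetical helper: expand the four macros used in the command above.
# Assumption: $USER1$ is the plugin directory /usr/local/nagios/libexec.
# Arguments containing "|", "&" or "\" would need escaping for sed.
expand_check_command() {
    template=$1
    host=$2
    arg1=$3
    arg2=$4
    printf '%s\n' "$template" | sed \
        -e 's|\$USER1\$|/usr/local/nagios/libexec|g' \
        -e 's|\$HOSTADDRESS\$|'"$host"'|g' \
        -e 's|\$ARG1\$|'"$arg1"'|g' \
        -e 's|\$ARG2\$|'"$arg2"'|g'
}

expand_check_command \
    '$USER1$/check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p 5' \
    192.0.2.10 '100.0,20%' '500.0,60%'
# /usr/local/nagios/libexec/check_ping -H 192.0.2.10 -w 100.0,20% -c 500.0,60% -p 5
```

Running the expanded line as the nagios user is usually closer to what the scheduler does than running it as root.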
npolovenko
Support Tech
Posts: 3457
Joined: Mon May 15, 2017 5:00 pm

Re: All hosts are down - icmp check failing

Post by npolovenko »

Hello, @Shwele.
I definitely did not try to learn your location on purpose. I came across the website you posted when I was working on another thread, and because I'm fluent in your language I decided to say hello :) However, if you have a privacy concern, let me know and I'll clean up my previous posts. My apologies.
Shwele wrote: "Update: After moving check_icmp and copying check_ping with its name, it still appears down in the Nagios interface, even though it works from the command line."
Did you modify the template? Go to Configure/Core Configuration Manager and click on the host to check which host template is being used.
template.png
Usually it's linux_server
Then navigate to Templates/Host Templates in the left column, click on the template, and change the check command. When that's done, click Apply Configuration.
You don't need to move/rename or modify plugin files themselves.

Now, if the Azure servers don't respond to ping, you could use another template with a different command of your choice for the initial host-health check. You could use check_tcp, or check_http, which requests the server's web page over HTTP; if the page is up, the host status will be OK.
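The idea behind a TCP- or HTTP-based host check is simply "can I open a connection to a port that should be listening". A bare-bones illustration of that idea, assuming bash (for its /dev/tcp redirection) and the coreutils timeout command. This is a stand-in sketch, not how check_tcp or check_http are implemented, and tcp_host_alive is an invented name:

```shell
#!/bin/bash
# Hypothetical stand-in for a TCP-based host check. Uses the Nagios plugin
# exit-code convention: 0 = OK, 2 = CRITICAL.
tcp_host_alive() {
    local host=$1
    local port=$2
    local secs=${3:-5}
    # bash opens a TCP connection when redirecting to /dev/tcp/HOST/PORT;
    # timeout gives up on the attempt after $secs seconds.
    if timeout "$secs" bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
        echo "TCP OK - $host:$port accepts connections"
        return 0
    fi
    echo "TCP CRITICAL - cannot connect to $host:$port"
    return 2
}

# Example: treat the host as up if its web server port answers.
#   tcp_host_alive Target_Server_IP 80
```

In XI itself you would pick an existing check command in the host template rather than a script like this; the sketch only shows why such a check keeps working when ICMP is filtered.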
Shwele
Posts: 47
Joined: Tue Oct 03, 2017 3:00 am

Re: All hosts are down - icmp check failing

Post by Shwele »

Hellows @npolovenko

Oh, so it's like that. Awesome, what website btw? You can PM too. :D
I've got no issues with that; it was just a joke on my part, so don't worry about it, it doesn't bother me. It's even nice. :)
PS: privacy is always a concern ;)
PPS: it's OK to leave it, I don't mind it, really :D

OK, that did the trick! I modified the template to use a suitable check, changing it from check-host-alive to check-host-alive-http. Hosts are now alive and well.

Though now ping is causing issues with the local server, which could be due to our meddling before :1

In the Nagios XI web interface, the service shows as unknown with this:

Code: Select all

CRITICAL - Could not interpret output from ping command
When I run the ping command that Nagios shows in Status Information, here is the output:

Code: Select all

/usr/bin/ping -n -U -w 10 -c 5 12.34.56.789
PING 12.34.56.789 (12.34.56.789) 56(84) bytes of data.
64 bytes from 12.34.56.789: icmp_seq=1 ttl=63 time=10.4 ms
64 bytes from 12.34.56.789: icmp_seq=2 ttl=63 time=0.454 ms
64 bytes from 12.34.56.789: icmp_seq=3 ttl=63 time=5.82 ms
64 bytes from 12.34.56.789: icmp_seq=4 ttl=63 time=0.551 ms
And here is the output of /usr/local/nagios/libexec/check_ping -H 12.34.56.789 -w 3000.0,80% -c 5000.0,100% -p 5:

Code: Select all

PING OK - Packet loss = 0%, RTA = 0.75 ms|rta=0.747000ms;3000.000000;5000.000000;0.000000 pl=0%;80;100;0
Also, I'm seeing a bunch of checks that have -local in their names, for example disk. Should I look online for some other Nagios check for disks, since this one only checks locally instead of checking the host it is assigned to?
npolovenko
Support Tech
Posts: 3457
Joined: Mon May 15, 2017 5:00 pm

Re: All hosts are down - icmp check failing

Post by npolovenko »

@Shwele
Shwele wrote: "In NagiosXI web interface service as unknown showing this: CRITICAL - Could not interpret output from ping command"
That's a weird issue. Can you switch to the nagios user on the command line:

Code: Select all

su - nagios
and run the same commands:

Code: Select all

/usr/local/nagios/libexec/check_ping -H 12.34.56.789 -w 3000.0,80% -c 5000.0,100% -p 5
and this:

Code: Select all

/usr/bin/ping -n -U -w 10 -c 5 12.34.56.789
Do you get any errors/permission issues like that? If yes, you might need to run chmod u+s /bin/ping.

Also, in XI go to Core Configuration Manager, click on the host that has ping errors, and click "Run Check Command".
screenshot-192.168.4.172-2017-10-31-12-58-43-575.png
Does that work OK? Make sure your command is defined as in my screenshot.
Shwele wrote: "Also I'm seeing bunch of checks that have -local in it... for example disk, so I should find online some nagios service for checking disks, since this one does it locally only, without getting hostname from host itself?"
Can you clarify? Do you mean that all services have the config name "localhost" in Core Configuration Manager? Services will have the same config name as their host. Are you asking how to monitor disk, etc. on another server? Are your servers running Windows or Linux?
Shwele
Posts: 47
Joined: Tue Oct 03, 2017 3:00 am

Re: All hosts are down - icmp check failing

Post by Shwele »

Yeah, looks like ping's permissions hadn't been changed. I ran chmod u+s /usr/bin/ping and now it went smoothly.

Now it's all green and well; looks like the default permissions weren't enough. Btw, why don't you include that chmod in the Nagios XI installation, since it would save you the troubleshooting? The same goes for the ICMP issue: when you install as root, you should have permission to do so.
npolovenko wrote: "Can you clarify? Do you mean that all services have a config name 'localhost' in Core Configuration Manager? Services will have the same name as a host. Are you asking how to monitor disk, etc. on another server? Are your servers running Windows or Linux?"
No, I mean I copied the services I wanted from localhost to reuse them and changed which host they check. But I'm getting the same disk usage output, which is from localhost, so I'm guessing the checks are for the server that Nagios is on.

Like, I used copy, then renamed the service check to hostname_checkname and assigned it to the previously created host. But it seems these services (some of them?) are local only, and remote checks have to be done over NRPE?
Exactly: disk, MySQL checks, Apache, etc. I think all of them so far are checking localhost, not the host I want them to.
All our instances are Linux servers, already up and running.

I tried going through the wizard for another server to see how that goes, and it seems the only option is to install NRPE on that server if I want it to be stalked by Nagios? :x
And NRPE needs it clean, so you see my troubles... though SNMP seems like a valid option; I'll write a ticket about the issues with it.

I'll try to elaborate on a few things and make a new topic, since this one is getting roundabout.
I'll PM you the current state of the hosts, since they are no longer dummy hosts.
npolovenko
Support Tech
Posts: 3457
Joined: Mon May 15, 2017 5:00 pm

Re: All hosts are down - icmp check failing

Post by npolovenko »

Hi @Shwele,
Shwele wrote: "Btw why don't you implement that chmod in nagiosxi installation, due that it saves you troubleshooting and dealing with it? There is even ICMP issue as well, when you install it as root you should have permission to do so?"
Usually, all standard plugins in XI work right out of the box without the need to change permissions. Not sure what happened, but we'll do some more QA tests to identify the issue and fix it in a future release.
Shwele wrote: "Like, I used copy, then renamed service check to hostname_checkname and added previously created host. But it seems these services are for local only (some of them?) and it has to be done over NRPE?"
Yeah, exactly. Many checks, such as check_disk, check_processor, and check_apache, cannot run over the standard HTTP protocol without installing an additional agent on the remote server. That's where agents like NCPA or NRPE come into play. They run those plugins locally on the remote server (using sudo where required) and send the results back to Nagios using various protocols.
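With NRPE, for example, the Nagios server calls check_nrpe and names a command that is defined on the remote host; the NRPE daemon there runs the actual plugin and returns its output. A hedged sketch of the server side, assuming check_nrpe is installed in the plugin directory used earlier in this thread and that the remote nrpe.cfg defines a command called check_disk (both are assumptions, as is the wrapper name remote_check):

```shell
#!/bin/sh
# Hypothetical wrapper around check_nrpe (-H = remote host, -c = name of a
# command defined in the remote host's nrpe.cfg).
# Assumption: check_nrpe lives in the plugin directory from this thread.
NRPE_BIN=${NRPE_BIN:-/usr/local/nagios/libexec/check_nrpe}

remote_check() {
    host=$1
    command_name=$2
    if [ ! -x "$NRPE_BIN" ]; then
        echo "UNKNOWN - check_nrpe not found at $NRPE_BIN"
        return 3          # 3 is the plugin exit code for UNKNOWN
    fi
    "$NRPE_BIN" -H "$host" -c "$command_name"
}

# Example: ask the remote host (not localhost) for its disk usage.
#   remote_check Target_Server_IP check_disk
```

On the remote side, the matching nrpe.cfg entry would look something like command[check_disk]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p / where the thresholds and path are purely illustrative. This is why a service copied from localhost keeps reporting the Nagios server's own disk: the plugin runs wherever it is invoked.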

I'm going to close this thread as resolved but feel free to open another one if you encounter other problems.