Nagios Support Forum

Posted: **Wed Mar 06, 2013 3:59 am**

OK, here it is ... Postfix is working fine with the test options from this article. I've got mail.

... so next is Nagios ... I configured it as I said yesterday ... I have the router host template in template.cfg set up like this ...


define host{
        name                    in-router
        use                     generic-host
        check_period            24x7
        check_interval          5
        retry_interval          1
        max_check_attempts      10
        check_command           check-host-alive
        notification_period     24x7
        notification_interval   30
        notification_options    d,r
        contact_groups          admins,etest
        hostgroups              routers-in
        register                0
}

Then I have contact.cfg configured like this:


define contact{
        contact_name                    ivan
        use                             generic-contact
        alias                           test
        email                           [email protected]
}

define contactgroup{
        contactgroup_name       etest
        alias                   mail proba
        members                 ivan
}

AND GUESS WHAT !!!!!!!!!!! IT WORKS !!!!!!!!!!!!!!!!!!!!!!!!!

I just shut down one of the ports ... then I started writing that something was wrong ... and just before I submitted this ... there it was ... MAIL!

Hahahaha, this was fun ... the feeling when you get something to work.

Thanks to all for the help and support that you gave me!!!

Now I'm trying to figure out why it takes Nagios so long to see that something is down or critical ...
Correct me if I'm wrong ... say we have a router and suddenly a port goes down. Does Nagios know about it immediately, or does it wait for the next scheduled service check? If it knows immediately, then only the refresh period of the Nagios front end has to pass. If not, then either the service check interval has to be changed, or I don't have to change anything because it works great as it is.

... I'm just trying to get the most out of this lovely system!!!!!

My next thing is how to get details about the service that is down ... for now I get just this:

Notification Type: PROBLEM
Host: RouterCiscoTest
State: DOWN
Address: 10.20.20.56
Info: CRITICAL - Host Unreachable (10.20.20.56)

Date/Time: Wed Mar 6 09:44:41 CET 2013

I'm trying to find out how to get it to tell me, for example, that FastEthernet0/0 is down ...

Posted: **Wed Mar 06, 2013 8:46 am**

Hi MPIvan,

Glad to hear Postfix is working!

Not sure if your other questions should be posted on a new thread or not, but in the meantime I have some replies for you.

I think I understood your question right, asking why it takes so long for your notification to be sent out after the host goes into a non-up state.


        retry_interval          1
        max_check_attempts      10

So this means that once a check fails, Nagios rechecks the host every minute until it has made 10 attempts (roughly 10 minutes in total) before it decides that the host is in fact down, at which point it sends out a notification.
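If that delay is too long for a test router, the retry settings can be tightened in the template. The fragment below is only a sketch: the template name `in-router-fast` and the value 3 are assumptions picked for illustration, and it assumes the default `interval_length` of 60 seconds, so intervals are in minutes.

```cfg
# Hypothetical variant of the in-router template that reaches a hard DOWN state sooner.
define host{
        name                    in-router-fast   ; assumed name, not from the original config
        use                     in-router        ; inherit everything else from the template above
        retry_interval          1                ; recheck every minute while the problem is unconfirmed
        max_check_attempts      3                ; first failure plus 2 retries, about 2 minutes to hard DOWN
        register                0                ; template only, not a real host
}
```

The first failure itself can still take up to one `check_interval` (5 minutes here) to be noticed, so lowering `check_interval` is the other knob if faster detection matters.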

Regarding the other question, about the details of a service which is down, below is what you posted from what I assume is the email notification:

Info: CRITICAL - Host Unreachable (10.20.20.56)

By the look of it, the notification is saying that the host itself is down. Nagios will not send notifications about services if the host itself is down (e.g. if the host machine has been shut down, Nagios will only notify you that the host is unreachable, instead of also notifying you that all the services are down, because that would be expected if Nagios can't even contact the host).

It may be worth checking that Nagios can reach the host and vice versa. Once the host is in an UP state, you will receive individual notifications about each service that is in a non-OK state.

Hope this helps, please say if anything has been misunderstood on my part.

Thank you.

Kind Regards,

Gary Shergill

Posted: **Wed Mar 06, 2013 12:04 pm**

@gshergill: Concise and informative as always, thanks for the help!

@MPIvan: Good news. Glad to hear Postfix is humming along and sending out those alerts. If your questions concerning Postfix are wrapped up, it is best to start a new thread for new, unrelated questions. Wrap up any current questions with gshergill and me, and then start a new thread for future support requests. Let me know when the thread is lock-ready.

Posted: **Thu Mar 07, 2013 6:52 am**

Thanks @gshergill and @abrist ... although I'd love to close this thread, I have some questions to ask ... If I understand @gshergill correctly, you're telling me that I have set up the contact to send me an e-mail only if the host is down (that is, everything at once, for example if I have a router and all interfaces are down and Nagios has no access to that host). That's how I understood you.

... so I read up on it, and here is my next attempt ...


define contact{
        contact_name                            ivan
#       use                                     generic-contact
        alias                                   contacttest
        email                                   [email protected]
        host_notification_commands              notify-host-by-email
        host_notification_options               d,u,r
        host_notification_period                24x7
        service_notification_commands           notify-service-by-email
        service_notification_options            w,u,c,r
        service_notification_period             24x7
}

Yes, use is commented out,
and in my router.cfg file I have the host defined like this ...


define host{
        use                     router-in
        host_name               RouterCiscoT
        alias                   Test Cisco SNMP
        address                 10.20.20.56
        _SNMPCOMMUNITY          CiscoTest
        contacts                ivan
}

I have shut down the interface, and I have also tested by unplugging the cable from the fa0/0 interface, and I get no mail.

..

And here is what I read:

service_notification_commands: This defines the command or commands to be run when a state change on a service prompts a notification for this contact. In this case, we're going to e-mail the contact the results with a predefined command called notify-service-by-email.

service_notification_options: This specifies the different kinds of service events for which this contact should be notified. Here, we're using w,u,c,r, which means we want to receive notifications about the services entering the WARNING, UNKNOWN, or CRITICAL states, and also when they recover and go back to being in the OK state.

Posted: **Thu Mar 07, 2013 7:24 am**

Hi MPIvan,

I think there may be a slight misunderstanding.

My first question is in your host definition:


        _SNMPCOMMUNITY          CiscoTest

I'm not sure that works... I could be wrong though.

Could you run the following command please to check your configuration?


/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

I was asking to make sure your Nagios server can actually reach the host 10.20.20.56.
The following command should give you the answer:


/usr/local/nagios/libexec/check_ping -H 10.20.20.56 -w 100.0,20% -c 100.0,60%

Do you restart nagios after you change the config files?

Kind regards,

Gary Shergill

Posted: **Fri Mar 08, 2013 5:47 am**

Hi gshergill, ... everything works well ... _SNMPCOMMUNITY CiscoTest works great ... I'm defining a different community string for different hosts: for routers it is ciscotest, for switches it will be switchtest, for example. As for checking the config, yes, I run it every time before I restart Nagios and it goes well, 0 warnings 0 errors ... and yes, Nagios can reach the Cisco host ... and yes, the mail is sent fine when I unplug the cable from the Cisco router.

So, as you said before, if the host is down on the Nagios side (example: NAGIOS <-------->fa0/0 ROUTER fa0/1<----------->SWITCH<---------->PC), that is, if fa0/0 is down, Nagios will send me a mail that the router is down, with the message that the host ROUTER CISCO IS DOWN.

But what I want is this: if, for example, port fa0/1 is down, the router will still have communication with the Nagios server, so Nagios can send me a mail (a service mail) saying that fa0/1 is down, and not that the whole router is down as HOST IS DOWN. I'd like that service-specific notification: NAGIOS <-------->fa0/0 ROUTER fa0/1<----------->SWITCH<---------->PC

So with what I did previously, I don't know where I'm going wrong. I hope you understand me.

If not, @abrist, I'm sorry,

we're going to have to delay the locking of this thread.

Posted: **Fri Mar 08, 2013 12:12 pm**

This is most likely happening because the host check is running before the service check. Also, because Nagios is using the IP bound to fa0/0, it doesn't have an opportunity to know that just the port is down as opposed to the whole host. I would put the question this way: if Nagios is communicating on a network that has only one link to the Cisco device, and that link drops, how is Nagios supposed to tell that just that port is down as opposed to the entire host? As far as it is concerned, it cannot communicate with the device at all, and cannot verify what exactly is going on.

In addition to that, once nagios detects that the host or parent is down, it will not run service checks, to avoid sending many emails to notification recipients. Does this make sense, for the fa0/0 being disconnected?
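The parent relationship mentioned here is declared with the `parents` directive on the downstream host. A sketch for the switch behind the router in this thread follows; the host name `SwitchTest`, the `generic-switch` template and the address are placeholders, not values from MPIvan's setup.

```cfg
# Hypothetical switch that sits behind the router's fa0/1 port.
define host{
        use                     generic-switch   ; assumed template (as in the Nagios sample config)
        host_name               SwitchTest       ; placeholder name
        alias                   Test switch behind RouterCiscoT
        address                 192.0.2.10       ; placeholder address
        parents                 RouterCiscoT     ; Nagios reaches this host through the router
}
```

With a parent defined, a failed router should leave the switch flagged UNREACHABLE rather than DOWN, and a contact is only mailed about that state if its `host_notification_options` include `u`.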

As for fa0/1 being disconnected: that is again the only connection Nagios has to the switch/PC on the other side of the router. Nagios will understand that just the port is down, and likely any other devices behind it, however it would have no way to communicate with the PC to alert it to the issue. This is partially why enterprise networks have duplication throughout: if one thing goes down, there is a second to keep the service functioning.

If you wish to have Nagios be able to tell when fa0/0 is down, you would need to enable failover and have a second port connected with the same IP, so that Nagios does not see a difference.

Hope that helps! I will admit, I was a little lost with the formatting of text there.
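For the fa0/1 case, where the router stays reachable through fa0/0, a per-interface service check is what produces a service notification instead of a host one. MPIvan's actual service definitions are not shown in the thread, so the fragment below is only a sketch: the command name `check_snmp_ifstatus`, the `generic-service` template and the interface index 2 are assumptions. It polls the standard ifOperStatus OID with the stock check_snmp plugin and reuses the host's `_SNMPCOMMUNITY` custom variable through the `$_HOSTSNMPCOMMUNITY$` macro.

```cfg
# Hypothetical command: query one interface's ifOperStatus over SNMP.
# $_HOSTSNMPCOMMUNITY$ expands to the host's _SNMPCOMMUNITY custom variable.
define command{
        command_name    check_snmp_ifstatus      ; assumed name
        command_line    $USER1$/check_snmp -H $HOSTADDRESS$ -C $_HOSTSNMPCOMMUNITY$ -o 1.3.6.1.2.1.2.2.1.8.$ARG1$ -r 1
}

# Service that goes CRITICAL when the interface is not "up" (ifOperStatus 1).
define service{
        use                     generic-service  ; assumed template from the sample config
        host_name               RouterCiscoT
        service_description     FastEthernet0/1 Link Status
        check_command           check_snmp_ifstatus!2   ; 2 = assumed ifIndex of fa0/1
        contacts                ivan
}
```

The real ifIndex for fa0/1 has to be looked up on the router, for example by walking ifDescr (1.3.6.1.2.1.2.2.1.2) with an SNMP tool, since index numbers differ between devices.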

Posted: **Mon Mar 11, 2013 9:12 am**

Well, I think we can close this thread for now. I don't know what happened, but it started to work ... and yes, the principle is as you said: if Nagios has no access to the host, or in our example if fa0/0 is down, it sends a mail that the host is down ... and if fa0/1 is down, it sends a mail that fa0/1 is down. That I understand ... I don't know why no mail was sent when I tried disconnecting fa0/1 before, and now it works perfectly ... I have changed nothing, but it works.

So thanks to all for the support and help ...!!!! @sreinhardt, thanks very much, all of you, you helped me hugely!

Posted: **Mon Mar 11, 2013 9:41 am**

Great, thank you for letting us know it is working, MPIvan! Closing thread. If you run into a problem with this again, please open a new one and reference this thread by URL.
