I understand your confusion: you are misreading the log format. Please note that you don't actually have any contactgroup named "PortalWeb1"; in these log entries that field is the host name. You're probably thinking of the contactgroup you have named "portaladmin."
As I mentioned earlier, a notification produces one NOTIFICATION log entry for each contact that gets notified for an alert. In your log:
[1438574804] SERVICE ALERT: PortalWeb1;CPU Load;CRITICAL;SOFT;1;CPU Load 99% (5 min average)
[1438574924] SERVICE ALERT: PortalWeb1;CPU Load;CRITICAL;SOFT;2;CPU Load 99% (5 min average)
[1438575044] SERVICE ALERT: PortalWeb1;CPU Load;CRITICAL;HARD;3;CPU Load 99% (5 min average)
[1438575044] SERVICE NOTIFICATION: nagiosadmin;PortalWeb1;CPU Load;CRITICAL;notify-service-by-email;CPU Load 99% (5 min average)
[1438575644] SERVICE ALERT: PortalWeb1;CPU Load;OK;HARD;3;CPU Load 0% (5 min average)
[1438575644] SERVICE NOTIFICATION: nagiosadmin;PortalWeb1;CPU Load;OK;notify-service-by-email;CPU Load 0% (5 min average)
Translates to:
ALERT 1 SOFT CRITICAL
ALERT 2 SOFT CRITICAL
ALERT 3 transition to HARD CRITICAL (raise notification)
NOTIFICATION to nagiosadmin CRITICAL
ALERT 3 HARD OK (recovery; raise notification)
NOTIFICATION to nagiosadmin OK
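The field layout behind that translation can be sketched in Python. This is only a minimal illustration, assuming the semicolon-separated layouts visible in the log excerpt above; the function name `parse_line` and the dictionary keys are labels I made up for readability, not anything Nagios itself defines.

```python
import re

# Field names are illustrative labels for the semicolon-separated columns
# seen in the log excerpt above; they are not official Nagios identifiers.
ALERT_FIELDS = ["host", "service", "state", "state_type", "attempt", "output"]
NOTIFY_FIELDS = ["contact", "host", "service", "state", "command", "output"]

LINE_RE = re.compile(r"^\[(\d+)\] (SERVICE ALERT|SERVICE NOTIFICATION): (.*)$")


def parse_line(line):
    """Split one log line into a dict, or return None for other entry types."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    timestamp, kind, rest = match.groups()
    fields = ALERT_FIELDS if kind == "SERVICE ALERT" else NOTIFY_FIELDS
    # Limit the split so semicolons inside the plugin output stay intact.
    values = rest.split(";", len(fields) - 1)
    entry = dict(zip(fields, values))
    entry["timestamp"] = int(timestamp)
    entry["kind"] = kind
    return entry


alert = parse_line(
    "[1438575044] SERVICE ALERT: PortalWeb1;CPU Load;CRITICAL;HARD;3;"
    "CPU Load 99% (5 min average)"
)
notification = parse_line(
    "[1438575044] SERVICE NOTIFICATION: nagiosadmin;PortalWeb1;CPU Load;"
    "CRITICAL;notify-service-by-email;CPU Load 99% (5 min average)"
)
print(alert["host"], alert["state_type"], alert["attempt"])
print(notification["contact"], notification["host"])
```

Note that the first field of a NOTIFICATION entry is the contact, while the first field of an ALERT entry is the host, which is why "PortalWeb1" shows up in both kinds of line without being a contact or contactgroup.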
It sounds like what you would like to see is something more like this:
[1438574804] SERVICE ALERT: PortalWeb1;CPU Load;CRITICAL;SOFT;1;CPU Load 99% (5 min average)
[1438574924] SERVICE ALERT: PortalWeb1;CPU Load;CRITICAL;SOFT;2;CPU Load 99% (5 min average)
[1438575044] SERVICE ALERT: PortalWeb1;CPU Load;CRITICAL;HARD;3;CPU Load 99% (5 min average)
[1438575044] SERVICE NOTIFICATION: rcarter;PortalWeb1;CPU Load;CRITICAL;notify-service-by-email;CPU Load 99% (5 min average)
[1438575044] SERVICE NOTIFICATION: jwells;PortalWeb1;CPU Load;CRITICAL;notify-service-by-email;CPU Load 99% (5 min average)
[1438575044] SERVICE NOTIFICATION: joshtate2001;PortalWeb1;CPU Load;CRITICAL;notify-service-by-email;CPU Load 99% (5 min average)
[1438575044] SERVICE NOTIFICATION: jtate;PortalWeb1;CPU Load;CRITICAL;notify-service-by-email;CPU Load 99% (5 min average)
[1438575644] SERVICE ALERT: PortalWeb1;CPU Load;OK;HARD;3;CPU Load 0% (5 min average)
[1438575644] SERVICE NOTIFICATION: rcarter;PortalWeb1;CPU Load;OK;notify-service-by-email;CPU Load 0% (5 min average)
[1438575644] SERVICE NOTIFICATION: jwells;PortalWeb1;CPU Load;OK;notify-service-by-email;CPU Load 0% (5 min average)
[1438575644] SERVICE NOTIFICATION: joshtate2001;PortalWeb1;CPU Load;OK;notify-service-by-email;CPU Load 0% (5 min average)
[1438575644] SERVICE NOTIFICATION: jtate;PortalWeb1;CPU Load;OK;notify-service-by-email;CPU Load 0% (5 min average)
Understand?
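If you want to check quickly which contacts were actually notified for each event, something along these lines could help. It is a rough sketch in Python, assuming the NOTIFICATION layout shown in the log entries above; `contacts_per_event` is a hypothetical helper name, and the sample lines are copied from the desired output above rather than from a real log.

```python
from collections import defaultdict


def contacts_per_event(lines):
    """Map (timestamp, host, service, state) to the contacts notified for it."""
    notified = defaultdict(list)
    prefix = "SERVICE NOTIFICATION: "
    for line in lines:
        stamp, _, rest = line.strip().partition("] ")
        if not rest.startswith(prefix):
            continue  # skip ALERT lines and anything else
        contact, host, service, state = rest[len(prefix):].split(";")[:4]
        key = (int(stamp.lstrip("[")), host, service, state)
        notified[key].append(contact)
    return dict(notified)


sample = [
    "[1438575044] SERVICE ALERT: PortalWeb1;CPU Load;CRITICAL;HARD;3;CPU Load 99% (5 min average)",
    "[1438575044] SERVICE NOTIFICATION: rcarter;PortalWeb1;CPU Load;CRITICAL;notify-service-by-email;CPU Load 99% (5 min average)",
    "[1438575044] SERVICE NOTIFICATION: jwells;PortalWeb1;CPU Load;CRITICAL;notify-service-by-email;CPU Load 99% (5 min average)",
]
result = contacts_per_event(sample)
for event, contacts in result.items():
    print(event, contacts)
```

Run against your current log, the only contact listed for each event would be nagiosadmin, which is the symptom we are trying to track down.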
Now that I've hopefully clarified how the logging works, can you post your service definition for CPU Load on host PortalWeb1? Also, if it has a "use" line indicating that it inherits from upstream templates, please post those as well. I think this is all a simple misconfiguration in your host and/or service definition. As I mentioned, there is nothing wrong with your contact and contactgroup definitions, and your log makes it appear that the service just isn't configured properly to notify the contactgroups you're trying to reach.