Email Notifications Logs + Sendmail + not sending
Re: Email Notifications Logs + Sendmail + not sending
Please see this post http://support.nagios.com/forum/viewtop ... 6&start=10
However, I tried to get the core config snapshot, which offers two options under Action: 1. Download (806MB) and 2. View Output (470KB).
I can provide the View Output file, but I think it will be of little use since it only contains service and host names. There are two types of warnings in that file:
"Warning: Service 'Vlan830 Status' on host 'host.local' has no check time period defined!" (this appears for all services)
"Warning: Duplicate definition found for service 'Slash Disk Usage' on host 'LSP01' (config file '/usr/local/nagios/etc/services/LSP01 .cfg', starting on line 231)" (this appears for 15-20 services)
Here is the tail of the "View Output" file:
Checked 4691 services.
Checking hosts...
Checked 145 hosts.
Checking host groups...
Checked 19 host groups.
Checking service groups...
Checked 8 service groups.
Checking contacts...
Warning: Contact 'lagan' has no host notification time period defined!
Checked 25 contacts.
Checking contact groups...
Checked 20 contact groups.
Checking service escalations...
Checked 0 service escalations.
Checking service dependencies...
Checked 0 service dependencies.
Checking host escalations...
Checked 0 host escalations.
Checking host dependencies...
Checked 0 host dependencies.
Checking commands...
Checked 91 commands.
Checking time periods...
Checked 31 time periods.
Checking for circular paths between hosts...
Checking for circular host and service dependencies...
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...
Total Warnings: 4343
Total Errors: 0
Things look okay - No serious problems were detected during the pre-flight check
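With 4343 warnings in the pre-flight output, it may help to group them by kind before reading through them. The following is only a rough sketch in Python, not a Nagios tool: the function name tally_warnings is made up for illustration, and it assumes the output has been saved as text with one "Warning:" message per line, as in the excerpts quoted in this post.

```python
import re
from collections import Counter

def tally_warnings(output: str) -> Counter:
    """Group 'Warning:' lines by message shape, ignoring quoted names."""
    counts = Counter()
    for line in output.splitlines():
        line = line.strip()
        if not line.startswith("Warning:"):
            continue
        # Replace every '...' quoted name with a placeholder so warnings
        # that differ only in host/service/contact name collapse together.
        shape = re.sub(r"'[^']*'", "'*'", line)
        # Drop the trailing "(config file ..., starting on line N)" detail.
        shape = re.sub(r"\s*\(config file.*\)$", "", shape)
        counts[shape] += 1
    return counts

# Sample lines copied from the post above.
sample = """\
Warning: Service 'Vlan830 Status' on host 'host.local' has no check time period defined!
Warning: Duplicate definition found for service 'Slash Disk Usage' on host 'LSP01' (config file '/usr/local/nagios/etc/services/LSP01 .cfg', starting on line 231)
Warning: Contact 'lagan' has no host notification time period defined!
Checked 25 contacts.
"""

for shape, n in tally_warnings(sample).most_common():
    print(n, shape)
```

Run against the full View Output text, this should show how many of the warnings fall into each of the kinds described above.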
Re: Email Notifications Logs + Sendmail + not sending
As a side-note to the current issue, I want to look at that 806MB profile issue. Can you run the following?
ls -lR > output.txt
and PM it to me? I want to see what might be making the download so huge.
Former Nagios employee
Re: Email Notifications Logs + Sendmail + not sending
tmcdonald wrote:As a side-note to the current issue, I want to look at that 806MB profile issue. Can you run the following?
ls -lR > output.txt
and PM it to me? I want to see what might be making the download so huge.
$ du -sh /usr/local/nagios/etc
3.4M /usr/local/nagios/etc
Back to the issue: do you need the whole config snapshot or only part of it?
-
slansing
- Posts: 7698
- Joined: Mon Apr 23, 2012 4:28 pm
- Location: Travelling through time and space...
Re: Email Notifications Logs + Sendmail + not sending
Just the actual configuration files. How large is the snapshot?
Re: Email Notifications Logs + Sendmail + not sending
It is 107KB in .tar.gz format.
Also, about the issue of some services (mainly memory-related ones) not sending email: considering just one particular service for now, when it first appeared in the GUI (Service Status, Service=Critical) it already had the acknowledgement and comment icons. Please find the screenshot attached.
-
slansing
- Posts: 7698
- Joined: Mon Apr 23, 2012 4:28 pm
- Location: Travelling through time and space...
Re: Email Notifications Logs + Sendmail + not sending
Okay, so that acknowledgement will cause alerts not to be sent out. In addition, if it is a sticky acknowledgement, notifications will not be sent out until the service fully recovers. Is it possible that a team member is constantly acknowledging them? You can check this via Home > Acknowledgements.
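To make the sticky versus non-sticky distinction concrete, here is a small Python sketch. It is a simplified model and not Nagios code: the function names are invented for illustration, and the numeric values (1 = normal, 2 = sticky) are an assumption based on the acknowledgement_type field that Nagios writes to status.dat. The assumed rules are that any acknowledgement is cleared on recovery to OK, a normal acknowledgement is also cleared by any other state change, and a sticky one survives changes between non-OK states.

```python
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
ACK_NONE, ACK_NORMAL, ACK_STICKY = 0, 1, 2  # assumed acknowledgement_type values

def ack_after_transition(ack_type: int, old_state: int, new_state: int) -> int:
    """Return the acknowledgement type that remains after a state change."""
    if ack_type == ACK_NONE or new_state == old_state:
        return ack_type
    if new_state == OK:
        return ACK_NONE      # recovery clears any acknowledgement
    if ack_type == ACK_NORMAL:
        return ACK_NONE      # non-sticky: cleared by any state change
    return ACK_STICKY        # sticky: survives WARNING <-> CRITICAL

def notifications_suppressed(ack_type: int) -> bool:
    """Problem notifications are held back while an acknowledgement exists."""
    return ack_type != ACK_NONE

# Walk a sticky acknowledgement through WARNING -> CRITICAL -> OK.
ack, state = ACK_STICKY, WARNING
for new_state in (CRITICAL, OK):
    ack = ack_after_transition(ack, state, new_state)
    state = new_state
    print(f"state={state} suppressed={notifications_suppressed(ack)}")
```

Under this model, a service that was acknowledged as sticky while in Warning stays silent when it later goes Critical, which would match a long-running problem that never sends mail.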
Re: Email Notifications Logs + Sendmail + not sending
FYI
Nagios server uptime: 39 days
Has the Nagios service been restarted or stopped at any point since the last boot? No
Persistent Comment=sticky ACK
If a service has a Persistent Comment, where can I see it? Will it be shown in Home > Acknowledgements? If so, I cannot see the host name or the service there for the service I posted in the image. However, I have pasted the entry for this service from status.dat (/usr/local/nagios/var/status.dat).
Also, what does 'problem_has_been_acknowledged=1' in that entry mean? Is it a persistent comment?
servicestatus {
host_name=Prod_2
service_description=Memory Usage
modified_attributes=1
check_command=check_xi_service_snmp_win_storage!-C snmpname -m "Real Memory" -r -w 85 -c 95 -f!!!!!!!
check_period=
notification_period=24x7
check_interval=15.000000
retry_interval=5.000000
event_handler=
has_been_checked=1
should_be_scheduled=1
check_execution_time=0.218
check_latency=0.055
check_type=0
current_state=2
last_hard_state=2
last_event_id=273589
current_event_id=273663
current_problem_id=56078
last_problem_id=56068
current_attempt=3
max_attempts=3
state_type=1
last_state_change=1390577982
last_hard_state_change=1390577982
last_time_ok=1367654673
last_time_warning=1390577082
last_time_unknown=1385187866
last_time_critical=1390816482
plugin_output=Real Memory: 97%used(31759MB/32768MB) (>95%) : CRITICAL
long_plugin_output=
performance_data='Real_Memory'=31759MB;27853;31130;0;32768
last_check=1390816482
next_check=1390817382
check_options=0
current_notification_number=1
current_notification_id=1298950
last_notification=0
next_notification=0
no_more_notifications=0
notifications_enabled=1
active_checks_enabled=1
passive_checks_enabled=1
event_handler_enabled=1
problem_has_been_acknowledged=1
acknowledgement_type=2
flap_detection_enabled=1
failure_prediction_enabled=1
process_performance_data=1
obsess_over_service=1
last_update=1390816734
is_flapping=0
percent_state_change=0.00
scheduled_downtime_depth=0
_XIWIZARD=0;solaris
}
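For anyone who wants to list every acknowledged service without scrolling through status.dat by hand, something like the following Python sketch might work. It is not an official parser: the function names are hypothetical, it assumes the simple "blocktype { key=value ... }" layout shown in the block above, and it assumes acknowledgement_type=2 denotes a sticky acknowledgement.

```python
import re

def parse_status_blocks(text: str):
    """Yield (block_type, fields) for each 'type { key=value ... }' block."""
    block_type, fields = None, {}
    for raw in text.splitlines():
        line = raw.strip()
        if block_type is None:
            m = re.match(r"^(\w+)\s*\{$", line)
            if m:
                block_type, fields = m.group(1), {}
        elif line == "}":
            yield block_type, fields
            block_type = None
        elif "=" in line:
            # Split on the first '=' only; values may contain '=' themselves.
            key, _, value = line.partition("=")
            fields[key] = value

def acknowledged_services(text: str):
    """Yield (host, service, acknowledgement_type) for acknowledged services."""
    for block_type, f in parse_status_blocks(text):
        if block_type == "servicestatus" and f.get("problem_has_been_acknowledged") == "1":
            yield (f.get("host_name"), f.get("service_description"),
                   int(f.get("acknowledgement_type", "0")))

# Abbreviated copy of the block posted above.
sample = """\
servicestatus {
host_name=Prod_2
service_description=Memory Usage
current_state=2
notifications_enabled=1
problem_has_been_acknowledged=1
acknowledgement_type=2
}
"""

for host, svc, ack_type in acknowledged_services(sample):
    kind = "sticky" if ack_type == 2 else "normal"
    print(f"{host} / {svc}: acknowledged ({kind})")
```

To use it on a live system, read /usr/local/nagios/var/status.dat into a string and pass that in place of sample.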
-
scottwilkerson
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
- Contact:
Re: Email Notifications Logs + Sendmail + not sending
vnc786 wrote: if the services has Persistent Comment where to see ? will it be shown in Home > Acknowledgements if that is the case then for Service which i have posted in image i am not able to see that host name nor services.
It will be in the status.dat, something like:
servicecomment{
host_name=192.168.5.1
service_description=DHCP
entry_type=4
comment_id=706
source=0
persistent=0
entry_time=1387475669
expires=0
expire_time=0
author=Nagios Administrator
comment_data=Problem is acknowledged
}
vnc786 wrote: also what does 'problem_has_been_acknowledged=1' above mean ? is it persistent comment ?
It means the service problem has been acknowledged, which may also have a comment.
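As a rough illustration of how the comment entry relates to the acknowledgement, the Python sketch below summarises one servicecomment block once its fields are in a dict. The function name is hypothetical, and the entry_type table (1 = user, 2 = downtime, 3 = flapping, 4 = acknowledgement) is my understanding of Nagios Core, so treat it as an assumption and check it against your version. The persistent flag belongs to the comment and is separate from whether the acknowledgement itself is sticky.

```python
# Comment entry_type values as I understand them from Nagios Core;
# this table is an assumption, not taken from the thread.
ENTRY_TYPES = {1: "user", 2: "downtime", 3: "flapping", 4: "acknowledgement"}

def describe_comment(fields: dict) -> str:
    """Summarise one servicecomment block given as a key/value dict."""
    kind = ENTRY_TYPES.get(int(fields.get("entry_type", "0")), "unknown")
    lifetime = "persistent" if fields.get("persistent") == "1" else "not persistent"
    return (f"{fields['host_name']}/{fields['service_description']}: "
            f"{kind} comment by {fields['author']} ({lifetime})")

# Field values copied from the servicecomment example above.
comment = {
    "host_name": "192.168.5.1",
    "service_description": "DHCP",
    "entry_type": "4",
    "persistent": "0",
    "author": "Nagios Administrator",
    "comment_data": "Problem is acknowledged",
}
print(describe_comment(comment))
```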
Re: Email Notifications Logs + Sendmail + not sending
We have some services that stay in Critical or Warning for a long time (one or two months). The service in the image (Memory Usage) had been in that state for a very long time, and View History did not show a Soft state for it. That may be why it was still holding the ACK sign. After I removed the acknowledgement from the Advanced tab, I got the email.
Also, earlier in this thread I mentioned an attachment containing the state history. What do you think about the case where a service goes from Warning to Critical in particular, since it is getting a recovery?
Can you explain Persistent Comment and sticky ACK to me, if possible with an example?
-
sreinhardt
- -fno-stack-protector
- Posts: 4366
- Joined: Mon Nov 19, 2012 12:10 pm
Re: Email Notifications Logs + Sendmail + not sending
vnc786 wrote: we have some services which stay in Critical or Warning for long times(1 months or 2 months). talking about the the service in the image(Memory Usage) was for very long time, in View History it was not showing Soft state, there might be the reason that it was showing/Holding ACK sign but i removed the ACK sign from Advance Tab after that i got email...
I don't believe that you really can remove an ACK from a problem once it has been ack'd and cause messages to send out again. So if this has been in place for a long time and someone else already ack'd it, I would never expect messages to go out again. I would suggest submitting a passive OK state instead, although that can mess with your reporting a bit. (I could be entirely wrong here too.)
vnc786 wrote: Also in above thread i have mention 1 attachment which contain State history in which particularly if service goes in Critical State from Warning State particularly ...? what do u think of that...since it is getting recovery
Switching between warning, critical, and unknown without a recovery to OK will not cause any change in the length of the down period, retry checks, or notifications. They are all considered a down state and will keep most of the same settings and flags; only switching to OK will reset them.
Nagios-Plugins maintainer exclusively, unless you have other C language bugs with open-source nagios projects, then I am happy to help! Please pm or use other communication to alert me to issues as I no longer track the forum.