Apologies in advance: I am new to Nagios system administration and am trying to work out where to start troubleshooting a problem that occurs sporadically. Our Nagios version is 5.5.8.
Some of our users submit XML check results through the NRDP interface, and each submission should trigger an email alert. This works most of the time, but every so often it stops working. While it is failing, other users who do not use the NRDP component continue to receive email alerts with no problem.
An example of the XML code is:
<checkresults>
  <checkresult type='service' checktype='1'>
    <hostname>stage_3</hostname>
    <servicename>dev_test_2</servicename>
    <state>1</state>
    <output>Report is ready</output>
  </checkresult>
</checkresults>
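In case it helps to reproduce the problem outside the web form, here is a rough Python sketch of how the same payload could be built and posted with a timeout, so that a hung submission shows up as an error rather than just a missing OK. The function names, URL, and token are made up for illustration, and the form field names (token, cmd=submitcheck, XMLDATA) are my assumption about what NRDP expects; I have not confirmed them against our install.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET


def build_checkresults(hostname, servicename, state, output):
    """Build the <checkresults> XML for one passive service check."""
    root = ET.Element("checkresults")
    result = ET.SubElement(
        root, "checkresult", {"type": "service", "checktype": "1"}
    )
    ET.SubElement(result, "hostname").text = hostname
    ET.SubElement(result, "servicename").text = servicename
    ET.SubElement(result, "state").text = str(state)
    ET.SubElement(result, "output").text = output
    return ET.tostring(root, encoding="unicode")


def submit_checkresults(nrdp_url, token, xml_payload, timeout=10):
    """POST the payload and return (HTTP status, response body).

    ASSUMPTION: the form fields token, cmd=submitcheck and XMLDATA match
    what the installed NRDP version expects; check this on your server.
    """
    data = urllib.parse.urlencode(
        {"token": token, "cmd": "submitcheck", "XMLDATA": xml_payload}
    ).encode("utf-8")
    request = urllib.request.Request(nrdp_url, data=data)
    # urlopen raises an exception if the server does not answer in time.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = response.read().decode("utf-8", errors="replace")
        return response.status, body


if __name__ == "__main__":
    payload = build_checkresults("stage_3", "dev_test_2", 1, "Report is ready")
    print(payload)
    # Hypothetical URL and token; replace with real values before submitting.
    # status, body = submit_checkresults(
    #     "http://nagios.example.com/nrdp/", "mytoken", payload
    # )
    # print(status, body)
```

Logging the status and body of each submission along with a timestamp would at least give us something to line up against phpmailer.log and maillog the next time an alert goes missing.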
It has been reported that when email messages fail to send, the NRDP interface also doesn't return an OK after clicking Submit XML Check Result.
In one recent test where an email failed to send, the email did arrive after the machine was rebooted. Possibly it is being queued somewhere, but we haven't been able to work out whether there is a queue of emails or how it could be monitored.
We have monitored phpmailer.log and maillog but are not sure what specifically to look for.
I apologize for the lack of details and specifics here, but if you can point us in the right direction (logs to check, information to capture, etc.), we will know what to check and capture next time to help narrow this down.
Thanks in advance.
Luke