
nagios.cmd changes

Posted: Mon Mar 04, 2019 10:50 am
by onegative
G'Day Nagios XI Support,

I keep running into situations where my nagios.cmd file changes from a named pipe to a regular file.
What could cause this, and how do I troubleshoot it? Once it occurs, setting up Scheduled Downtimes seems to fail, and I am forced to restart Nagios Core manually.

Any suggestions?

When I left on Friday:

Code: Select all

[root@dcom-nagiosxi-p1 rw]# ll
total 0
prw-rw---- 1 nagios nagcmd 0 Mar  1 07:06 nagios.cmd
srw-rw---- 1 nagios nagcmd 0 Mar  1 07:04 nagios.qh
When I returned on Monday:

Code: Select all

[root@dcom-nagiosxi-p1 rw]# ll
total 4
-rw-r--r-- 1 nagios nagcmd 136 Mar  4 05:01 nagios.cmd
srw-rw---- 1 nagios nagcmd   0 Mar  4 05:00 nagios.qh
Let me know and thanks,
Danny

Re: nagios.cmd changes

Posted: Mon Mar 04, 2019 2:09 pm
by scottwilkerson
Do you have any scripts that write commands directly to nagios.cmd?

Is your XI server set up to receive SNMP traps?

Re: nagios.cmd changes

Posted: Mon Mar 04, 2019 3:06 pm
by onegative
Hey Scott,

No, I do not have any external scripts writing to the pipe.

Yes, I do have SNMP traps enabled via snmptt, but I am not using them in production. Some traps might be reaching the server, but I am not sure.

Code: Select all

[nagios@dcom-nagiosxi-p1 ~]$ ps -ef | grep snmptt
root      5383     1  0 10:48 ?        00:00:00 /usr/bin/perl /usr/sbin/snmptt --daemon
snmptt    5384  5383  0 10:48 ?        00:00:00 /usr/bin/perl /usr/sbin/snmptt --daemon
nagios   12011 30673  0 12:03 pts/0    00:00:00 grep --color=auto snmptt

[nagios@dcom-nagiosxi-p1 ~]$ grep snmptt /etc/group
nagios:x:20954:apache,nagios,snmptt
nagcmd:x:20961:apache,nagios,snmptt
snmptt:x:989:

Re: nagios.cmd changes

Posted: Mon Mar 04, 2019 3:14 pm
by scottwilkerson
Can you run the following and show the results?

Code: Select all

grep nagios.cmd /usr/local/bin/snmptraphandling.py

Re: nagios.cmd changes

Posted: Mon Mar 04, 2019 3:25 pm
by onegative

Code: Select all


[root@dcom-nagiosxi-p1 rw]# grep nagios.cmd /usr/local/bin/snmptraphandling.py
        if os.path.exists('/usr/local/nagios/var/rw/nagios.cmd') and stat.S_ISFIFO(os.stat('/usr/local/nagios/var/rw/nagios.cmd').st_mode):
                output = open('/usr/local/nagios/var/rw/nagios.cmd', 'w')


Re: nagios.cmd changes

Posted: Mon Mar 04, 2019 3:40 pm
by scottwilkerson
That looks the way it is supposed to.

Hmm, this is a bit of a mystery. Usually this happens if something tries to write to the file during a brief window when the nagios process isn't running.

How frequently are you seeing this?
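
For anyone following along, the failure mode described above can be reproduced in a scratch directory. This is only a sketch: it assumes the pipe is absent while the nagios process is down (as the reply above suggests), and `safe_submit`, `demo`, and the `[0] EXAMPLE_COMMAND` payload are hypothetical names used for illustration, not part of Nagios or of snmptraphandling.py.

```python
import os
import stat
import tempfile


def safe_submit(path, line):
    """Write one line to path only if it is a FIFO with a reader attached.

    os.open() without O_CREAT never creates a file, so a missing pipe
    stays missing. With O_NONBLOCK, opening a FIFO that has no reader
    fails with ENXIO instead of blocking.
    """
    try:
        fd = os.open(path, os.O_WRONLY | os.O_NONBLOCK)
    except OSError:
        return False  # missing, not permitted, or no reader on the pipe
    # Check the descriptor we actually opened, not the path.
    if not stat.S_ISFIFO(os.fstat(fd).st_mode):
        os.close(fd)
        return False
    with os.fdopen(fd, "w") as pipe:
        pipe.write(line + "\n")
    return True


def demo():
    """Compare a naive open(path, 'w') with safe_submit() when no pipe exists."""
    with tempfile.TemporaryDirectory() as scratch:
        path = os.path.join(scratch, "nagios.cmd")

        # Naive writer: open(..., 'w') creates a regular file if nothing is there.
        with open(path, "w") as handle:
            handle.write("[0] EXAMPLE_COMMAND\n")  # placeholder payload
        naive_left_regular_file = stat.S_ISREG(os.stat(path).st_mode)
        os.remove(path)

        # Guarded writer: refuses to write and leaves nothing behind.
        submitted = safe_submit(path, "[0] EXAMPLE_COMMAND")
        nothing_created = not os.path.exists(path)
        return naive_left_regular_file, submitted, nothing_created


if __name__ == "__main__":
    print(demo())
```

The guard here inspects the opened descriptor rather than the path, which closes the small gap between a path check and the open that follows it.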

Re: nagios.cmd changes

Posted: Mon Mar 04, 2019 3:43 pm
by onegative
Often... I wonder if this happens when someone tries to set a Scheduled Downtime while someone else is making a configuration change?

Re: nagios.cmd changes

Posted: Mon Mar 04, 2019 3:59 pm
by ssax
You don't have any cron jobs or anything running on the backend that do any sort of automated host management or use any of the external commands, correct? Or any custom event handlers or anything?

See here for examples:

https://assets.nagios.com/downloads/nag ... ernalcmds/

snmptraphandling.py is the usual culprit here, but since yours has the S_ISFIFO check, it is not the issue.

Do you use NRDP for passive checks at all?

Re: nagios.cmd changes

Posted: Mon Mar 04, 2019 4:27 pm
by onegative
I have no cron jobs that perform any type of host management or issue external commands, and certainly none that write to the nagios.cmd file.

My environment is nearly all passive monitoring using the NCPA agents on remote hosts, which send their results to the NRDP API on the Nagios XI server.
[Attachment: Statistics.png]

Re: nagios.cmd changes

Posted: Mon Mar 04, 2019 5:05 pm
by npolovenko
@onegative, what is the output of:
ls -l /usr/local/nagios/var/
How long does it take from the moment you restart the nagios service until the cmd file gets overwritten with incorrect permissions? In the second listing, I see that both files, nagios.cmd and nagios.qh, were modified at 5 am. Was there a security scan running at that time, or anything else you can think of?

Try restarting the Nagios process and then applying the configuration in the GUI, and let me know if that changes the permissions on the cmd file.
Also, please check how much time passes before the permissions change again.
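
One way to answer the timing question is a small poller left running after the restart. The sketch below is an assumption-laden helper, not a Nagios tool: `file_kind`, `peek`, and `watch` are hypothetical names, and the 5-second interval is arbitrary. It prints a timestamp whenever the type of the command file changes, and `peek` reads what was written if the path has become a regular file (the 136-byte file in the Monday listing would presumably contain the command that some writer dropped there).

```python
import os
import stat
import time


def file_kind(path):
    """Classify path as 'missing', 'fifo', 'regular', 'socket' or 'other'."""
    try:
        mode = os.lstat(path).st_mode
    except FileNotFoundError:
        return "missing"
    if stat.S_ISFIFO(mode):
        return "fifo"
    if stat.S_ISREG(mode):
        return "regular"
    if stat.S_ISSOCK(mode):
        return "socket"
    return "other"


def peek(path, limit=512):
    """Return up to `limit` characters, but only if path is a regular file.

    A FIFO is never opened here, so the poller cannot block on the pipe
    or steal commands from it.
    """
    if file_kind(path) != "regular":
        return None
    with open(path, "r", errors="replace") as handle:
        return handle.read(limit)


def watch(path, interval=5.0, max_polls=None):
    """Poll path and record each change of file type with a timestamp."""
    changes = []
    last = None
    polls = 0
    while max_polls is None or polls < max_polls:
        kind = file_kind(path)
        if kind != last:
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            print(f"{stamp} {path}: {last} -> {kind}")
            changes.append((stamp, kind))
            if kind == "regular":
                print("contents:", repr(peek(path)))
            last = kind
        polls += 1
        time.sleep(interval)
    return changes


if __name__ == "__main__":
    # Path taken from the grep output earlier in the thread.
    watch("/usr/local/nagios/var/rw/nagios.cmd")
```

Comparing the timestamp of the `fifo -> regular` (or `fifo -> missing`) transition against cron, Apply Configuration runs, and the nagios log should narrow down which writer hits the path while the pipe is gone.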