Monitoring agent starts automatically after being stopped

miwalls · Post by **miwalls** » Fri Aug 31, 2018 9:40 am

I am having an issue with a secondary Nagios XI server set up for high availability. My goal is to keep the monitoring agent turned off on the secondary until the primary server goes offline. Once the primary is offline, the floating IP switches to the secondary and is set as the source address; the monitoring agent is then turned off on the primary (if possible) and turned on on the secondary. The problem is that the monitoring agent seemingly starts back up at random, even after I stop it manually. Are there any suggestions to fix this? I am controlling the stop and start of the Nagios monitoring agent with systemd commands. nagios.log doesn't show anything useful, and I can't find a log that shows what is restarting the monitoring agent. I am assuming that it is being restarted by one of the systemd nagios sessions.


[1535724278] Caught SIGTERM, shutting down...
[1535724278] Caught SIGTERM, shutting down...
[1535724278] Caught SIGTERM, shutting down...
[1535724278] Successfully shutdown... (PID=43427)
[1535724278] ndomod: Shutdown complete.
[1535724278] Event broker module '/usr/local/nagios/bin/ndomod.o' deinitialized successfully.
[1535724368] Nagios 4.4.2 starting... (PID=62743)
[1535724368] Local time is Fri Aug 31 09:06:08 CDT 2018
[1535724368] LOG VERSION: 2.0
[1535724368] qh: Socket '/usr/local/nagios/var/rw/nagios.qh' successfully initialized
[1535724368] qh: core query handler registered
[1535724368] qh: echo service query handler registered
[1535724368] qh: help for the query handler registered

Thanks for the help ahead of time.

npolovenko · Post by **npolovenko** » Fri Aug 31, 2018 11:11 am

Hello, @miwalls. Have you submitted any Apply Configuration commands after you stopped the nagios process? Or any external commands? Has the server been rebooted?

miwalls · Post by **miwalls** » Wed Sep 05, 2018 6:16 am

I actually figured out the issue. I set up high availability using the typical pcsd configuration, with a virtual IP resource for both incoming and source traffic and a systemd resource for nagios. I sync the primary and secondary servers by sending backups over ssh from the primary and then restoring them on the secondary. The problem arises when the secondary server receives the primary server's backup schedules. It runs (or attempts to run) the backups, which flips the nagios systemd unit back on, and then it goes about its happy way alerting me (often in the middle of the night) about stuff that isn't happening. Would you have any recommendations on how to fix this? I was thinking I could delete the backup_xi.sh script entirely after the restore cron job runs on the secondary, so that it can't do anything. I also think a somewhat important change to the backup script, and possibly the restore script, would be to check which services are running beforehand and then return them to that state after the backup or restore.
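The "check the state beforehand, then put it back" idea from this post could be sketched as a small wrapper around the backup or restore script. This is only an illustration under stated assumptions: `run_preserving_state` is a hypothetical helper name, the systemd unit is assumed to be called `nagios`, and the script path in the usage comment is an assumed install location. It is not part of the stock backup_xi.sh or restore_xi.sh.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nagios XI): run a command, then return
# a systemd unit to whatever state it was in before the command ran.

run_preserving_state() {
    rps_unit="$1"
    shift

    # Record whether the unit was running before the task.
    if systemctl is-active --quiet "$rps_unit"; then
        rps_was_active=1
    else
        rps_was_active=0
    fi

    # Run the wrapped task (for example a backup or restore script).
    "$@"
    rps_status=$?

    # Undo any state change the task made to the unit.
    if [ "$rps_was_active" -eq 0 ] && systemctl is-active --quiet "$rps_unit"; then
        systemctl stop "$rps_unit"
    elif [ "$rps_was_active" -eq 1 ] && ! systemctl is-active --quiet "$rps_unit"; then
        systemctl start "$rps_unit"
    fi

    # Report the wrapped task's exit status, not systemctl's.
    return "$rps_status"
}

# Usage (assumed path; adjust to the actual install):
#   run_preserving_state nagios /usr/local/nagiosxi/scripts/backup_xi.sh
```

On a standby node where nagios is stopped, the wrapper would stop it again if the wrapped script started it. On the active node it would leave a running nagios running.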

npolovenko · Post by **npolovenko** » Wed Sep 05, 2018 11:16 am

@miwalls, are you using the restore_xi.sh script on the secondary failover server? You could modify the script so that it does not automatically start the nagios process after the restore by commenting out this line:

$BASEDIR/manage_services.sh start nagios

miwalls · Post by **miwalls** » Thu Sep 06, 2018 9:28 am

Yeah, what I ended up doing was adding a sed command that replaces the start/restart nagios commands in backup_xi.sh and restore_xi.sh, and running it both before and after the restore. It seems that a restore changes these files back to their defaults. The issue is now fixed. Thanks.
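The exact sed command was not posted in the thread. A minimal sketch of the approach, assuming the lines to neutralize look like the `$BASEDIR/manage_services.sh start nagios` line quoted earlier (the `restart` form is an assumption), might comment those lines out rather than delete them. `disable_nagios_start` is a hypothetical function name.

```shell
#!/bin/sh
# Hypothetical helper: comment out every line that starts or restarts
# nagios through manage_services.sh in the given script files.
# A copy of each original file is kept with a .bak suffix.

disable_nagios_start() {
    for dns_file in "$@"; do
        # \1 keeps the indentation, \2 is the original command.
        # Lines that are already commented out do not match, so the
        # function is safe to run repeatedly.
        sed -i.bak -E \
            's/^([[:space:]]*)([^#]*manage_services\.sh[[:space:]]+(start|restart)[[:space:]]+nagios([[:space:]].*)?)$/\1# \2/' \
            "$dns_file"
    done
}

# Usage (file locations are assumptions; run after every restore, since
# the restore puts the default scripts back):
#   disable_nagios_start /usr/local/nagiosxi/scripts/backup_xi.sh \
#                        /usr/local/nagiosxi/scripts/restore_xi.sh
```

Commenting out instead of deleting keeps the change easy to spot and to revert, and calls that manage other services are left untouched.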

npolovenko · Post by **npolovenko** » Thu Sep 06, 2018 10:37 am

@miwalls, I see, thanks for the tip! I will be closing this thread as resolved.

Nagios Support Forum
