rsync downtime between Primary and Secondary Nagios servers

Posted: Thu Feb 16, 2017 5:13 pm
by nagmoto
I opened a similar question in R1 a while back. Maybe because my Nagios server version changed, I found that rsyncing /var/nagios/retention.dat from primary to secondary doesn't work anymore.

Current setup
Primary: Nagios server 4.1.1 in the EST timezone.
Secondary: Nagios server 4.2.4 in the CST timezone, running active checks on all hosts in Primary but with alert notifications disabled.
I am using a shell script wrapped around the rsync command to sync object/*.cfg, servers/*.cfg and /var/nagios/retention.dat.
Also, nagios.cfg has the following setting to speed up (from 90 to 1 min) the retention sync from memory into the retention.dat file.

Code:

[me@nagios01 nagios]$ egrep '^retention|^state_retention_file'  nagios.cfg
state_retention_file=/var/nagios/retention.dat
retention_update_interval=1
[me@nagios01 nagios]$
The current problem is that when I put a host in downtime on the primary Nagios, then after 5 minutes and an rsync of retention.dat,
the secondary won't display the same host in downtime mode.
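
One way to narrow this down is to check whether the downtime entry actually arrives in the copied file before looking at what the secondary displays. The sketch below assumes that retention.dat stores scheduled downtimes as blocks opening with `hostdowntime {` or `servicedowntime {`. That layout is an assumption worth confirming against your own file. `count_downtimes` is a hypothetical helper name, not part of Nagios.

```shell
#!/bin/sh
# Hypothetical helper: count scheduled-downtime blocks in a retention file.
# Assumes each downtime is stored as a block whose first line is
# "hostdowntime {" or "servicedowntime {".
count_downtimes() {
    # grep -c prints the number of matching lines (0 when none match)
    grep -c -E '^(host|service)downtime \{' "$1"
}
```

Running `count_downtimes /var/nagios/retention.dat` on both servers right after the rsync should give the same number. If the secondary shows fewer, the copied file was probably replaced after it arrived.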

Q1: Is retention.dat the only file I need to sync between the two Nagios servers?
Q2: Do I need to sync status.dat as well?

R1: https://support.nagios.com/forum/viewto ... t=downtime

Re: rsync downtime between Primary and Secondary Nagios serv

Posted: Thu Feb 16, 2017 5:58 pm
by avandemore
I'm not sure what impact the retention.dat version difference would have. I do know some of the internals have changed in Core, which may be related to this. If it was working previously, the first thing I would do is make sure the versions are the same. You are likely to run into that elsewhere anyway.

Re: rsync downtime between Primary and Secondary Nagios serv

Posted: Thu Feb 16, 2017 9:04 pm
by nagmoto
Fortunately, I had a VMware snapshot from before I upgraded my secondary Nagios server from 4.1.1 to 4.2.4.
Now both servers run the same Nagios version, but the rsync of /var/nagios/retention.dat from primary to secondary still doesn't work.
What I tried was:

Code:

1.  Stop nagios on the secondary.
2.  Delete secondary:/var/nagios/retention.dat and rsync over a new copy of retention.dat from the primary.
3.  Start nagios on the secondary.
Then I could see that the newly added downtimes were propagated to the secondary Nagios.
This also works for 4.1.1 to 4.2.2 servers.

It looks like stopping the secondary Nagios server before retention.dat gets copied over is important.
Do you know why?

Re: rsync downtime between Primary and Secondary Nagios serv

Posted: Fri Feb 17, 2017 9:34 am
by tgriep
I would guess that the secondary server, while it is still running, overwrites the rsynced copy of the file, so stopping the daemon would have to be added to the process you use to copy the dat files.
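
A minimal sketch of that stop, copy, start sequence, meant to run on the secondary. The host name `nagios01` for the primary, the `service nagios stop|start` commands, and the function name `sync_retention` are assumptions for illustration. Substitute whatever your init system and existing rsync wrapper use.

```shell
#!/bin/sh
# Assumed values; adjust for your environment.
PRIMARY="${PRIMARY:-nagios01}"                      # hypothetical primary host name
RETENTION="${RETENTION:-/var/nagios/retention.dat}"
STOP_CMD="${STOP_CMD:-service nagios stop}"         # assumes a SysV-style service script
START_CMD="${START_CMD:-service nagios start}"
COPY_CMD="${COPY_CMD:-rsync -a}"

sync_retention() {
    # 1. Stop the daemon so it cannot overwrite the copied file.
    $STOP_CMD || return 1
    # 2. Pull the retention file from the primary.
    $COPY_CMD "$PRIMARY:$RETENTION" "$RETENTION"
    rc=$?
    # 3. Start the daemon again even if the copy failed.
    $START_CMD || return 1
    return $rc
}
```

Calling `sync_retention` from cron on the secondary would replace the plain rsync of retention.dat. The daemon is restarted even when the copy fails, so a network error does not leave the secondary down.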

Re: rsync downtime between Primary and Secondary Nagios serv

Posted: Fri Feb 17, 2017 2:00 pm
by nagmoto
Please lock this thread.