Acknowledgements missing
Posted: Wed Mar 18, 2015 10:54 pm
by rajasegar
Yesterday we received complaints from users that all their acknowledgements had gone missing.
All the alert history is still there.
Please advise on this problem and on how to restore the acknowledgements.
Thanks.
Re: Acknowledgements missing
Posted: Thu Mar 19, 2015 10:02 am
by cmerchant
Acknowledgements are stored in the status.dat and retention.dat files.
You should check your retention settings. Also, do you have a ram disk configured? Was nagios restarted? Was the system rebooted?
Rather than duplicate everything here, I'll point you to this forum thread:
http://support.nagios.com/forum/viewtop ... =6&t=26166
and another forum thread here:
http://support.nagios.com/forum/viewtop ... =16&t=8266
Re: Acknowledgements missing
Posted: Thu Mar 19, 2015 10:04 am
by abrist
Acknowledgements are stored in the retention.dat file.
Was this file removed at some point?
Is state retention currently enabled?
grep "retention\|retain_state" /usr/local/nagios/etc/nagios.cfg
Re: Acknowledgements missing
Posted: Thu Mar 19, 2015 6:40 pm
by rajasegar
abrist wrote: Acknowledgements are stored in the retention.dat file.
Was this file removed at some point?
Is state retention currently enabled?
grep "retention\|retain_state" /usr/local/nagios/etc/nagios.cfg
Nothing was removed. We did move the following to a ramdisk as per your PDF instructions.
[nagios@nagiosprodxi1 nagiosramdisk]$ grep "retention\|retain_state" /usr/local/nagios/etc/nagios.cfg
retain_state_information=1
retention_update_interval=60
state_retention_file=/usr/local/nagios/var/retention.dat
[nagios@nagiosprodxi1 nagiosramdisk]$
These files are on the ramdisk, which has a capacity of 1 GB:
[nagios@nagiosprodxi1 nagiosramdisk]$ ls -l
total 44468
-rw-r--r-- 1 nagios nagios 18443323 Mar 20 07:03 objects.cache
-rw-r--r-- 1 nagios nagios 26913292 Mar 20 07:22 status.dat
drwxrwxr-x 2 nagios nagios 40 Mar 20 07:03 tmp
[nagios@nagiosprodxi1 nagiosramdisk]$
Re: Acknowledgements missing
Posted: Fri Mar 20, 2015 11:47 am
by ssax
For the alerts that were acknowledged but whose acknowledgements are now missing: if you look at those alerts, do they show as being handled?
Post the output of the following:
ls -lh /usr/local/nagios/var/retention.dat
Re: Acknowledgements missing
Posted: Mon Mar 23, 2015 12:41 am
by rajasegar
ssax wrote: For the alerts that were acknowledged but whose acknowledgements are now missing: if you look at those alerts, do they show as being handled?
Post the output of the following:
ls -lh /usr/local/nagios/var/retention.dat
No, they show up under unhandled. All the comments etc. are missing.
[nagios@nagiosprodxi1 perfdata]$ ls -lh /usr/local/nagios/var/retention.dat
-rw------- 1 nagios nagios 27M Mar 23 13:09 /usr/local/nagios/var/retention.dat
Re: Acknowledgements missing
Posted: Mon Mar 23, 2015 10:06 am
by ssax
If you have a backup of the file, you could make a copy of the current file, stop the nagios service, overwrite retention.dat with the backup, and start the nagios service. You would, however, lose any alerts/acks made since the backup was taken.
I'm not sure how they went missing, but if you don't have a backup, or don't want to lose anything recorded between your last backup and now, you would most likely need to have users re-acknowledge the alerts to get them back.
You could compress and PM me the retention.dat file. The acknowledgements are most likely missing from there as well, but I will take a look if you would like.
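The restore sequence described in this post could be sketched roughly as follows. This is only a minimal sketch, not an official procedure: the backup path is a hypothetical placeholder, the `restore_retention` helper is made up for illustration, and the `service nagios stop/start` line assumes a SysV-style init setup.

```shell
#!/bin/sh
# Sketch: swap in a known-good retention.dat while keeping a copy of the current one.
# BACKUP is a hypothetical path; point it at wherever your backups actually live.
RETENTION="${RETENTION:-/usr/local/nagios/var/retention.dat}"
BACKUP="${BACKUP:-/path/to/backup/retention.dat}"

restore_retention() {
    current="$1"
    backup="$2"
    # Refuse to continue if the backup is missing or empty.
    [ -s "$backup" ] || { echo "backup not found or empty: $backup" >&2; return 1; }
    # Keep a timestamped copy of the current file so the swap can be undone.
    cp -p "$current" "$current.$(date +%Y%m%d%H%M%S).bak" || return 1
    # Overwrite the live file with the backup.
    cp -p "$backup" "$current"
}

# Uncomment to run for real (needs root). Nagios should be stopped during the copy,
# since it writes its retained state back to retention.dat when it shuts down:
# service nagios stop && restore_retention "$RETENTION" "$BACKUP" && service nagios start
```

Anything acknowledged after the backup was taken is lost by this swap, which is why the helper keeps a timestamped copy of the current file first.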
Re: Acknowledgements missing
Posted: Mon Mar 23, 2015 6:00 pm
by rajasegar
ssax wrote: If you have a backup of the file, you could make a copy of the current file, stop the nagios service, overwrite retention.dat with the backup, and start the nagios service. You would, however, lose any alerts/acks made since the backup was taken.
I'm not sure how they went missing, but if you don't have a backup, or don't want to lose anything recorded between your last backup and now, you would most likely need to have users re-acknowledge the alerts to get them back.
You could compress and PM me the retention.dat file. The acknowledgements are most likely missing from there as well, but I will take a look if you would like.
This is the second time this has happened.
The funny thing is that I could still see all the acknowledgements in retention.dat even after they went missing from the UI.
So this looks like a Nagios bug.
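One way to check that observation is to count the records flagged as acknowledged in each file and compare. A minimal sketch, assuming both files carry a `problem_has_been_acknowledged=` line per host/service record; the `count_acks` helper is made up for illustration, and the status.dat path is a hypothetical placeholder since the full ramdisk path is not given in this thread.

```shell
#!/bin/sh
# Count records with a non-zero acknowledgement flag in a Nagios state file.
# Assumes lines of the form "problem_has_been_acknowledged=N" (possibly indented).
count_acks() {
    grep -c '^[[:space:]]*problem_has_been_acknowledged=[1-9]' "$1"
}

RETENTION="/usr/local/nagios/var/retention.dat"
STATUS="/path/to/nagiosramdisk/status.dat"   # hypothetical path, adjust to your ramdisk

# Print a count for each file that exists and is readable.
for f in "$RETENTION" "$STATUS"; do
    if [ -r "$f" ]; then
        echo "$f: $(count_acks "$f") acknowledged"
    fi
done
```

If retention.dat reports many acknowledged records while status.dat reports none, the acknowledgements are still on disk and the loss is somewhere between the retention file and what the UI reads.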
Re: Acknowledgements missing
Posted: Tue Mar 24, 2015 11:24 am
by tgriep
Can you run this and post back the output?
Re: Acknowledgements missing
Posted: Wed Mar 25, 2015 6:18 pm
by rajasegar
tgriep wrote: Can you run this and post back the output?
[my@nagiosproddb1 ~]$ sudo tail -100 /var/log/mysqld.log
140819 7:42:46 [ERROR] /usr/libexec/mysqld: Table './nagios/nagios_servicestatus' is marked as crashed and should be repaired
140819 7:42:46 [ERROR] /usr/libexec/mysqld: Table './nagios/nagios_servicestatus' is marked as crashed and should be repaired
140819 7:42:46 [ERROR] /usr/libexec/mysqld: Table './nagios/nagios_servicestatus' is marked as crashed and should be repaired
140819 7:42:46 [ERROR] /usr/libexec/mysqld: Table './nagios/nagios_servicestatus' is marked as crashed and should be repaired
140819 7:42:46 [ERROR] /usr/libexec/mysqld: Table './nagios/nagios_servicestatus' is marked as crashed and should be repaired
140819 7:42:46 [Note] /usr/libexec/mysqld: Normal shutdown
140819 7:42:46 [Note] Event Scheduler: Purging the queue. 0 events
140819 7:42:48 InnoDB: Starting shutdown...
140819 7:42:50 InnoDB: Shutdown completed; log sequence number 0 44253
140819 7:42:50 [Note] /usr/libexec/mysqld: Shutdown complete
140819 07:42:50 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended
140819 07:47:44 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
140819 7:47:44 InnoDB: Initializing buffer pool, size = 8.0M
140819 7:47:44 InnoDB: Completed initialization of buffer pool
140819 7:47:44 InnoDB: Started; log sequence number 0 44253
140819 7:47:44 [Note] Event Scheduler: Loaded 0 events
140819 7:47:44 [Note] /usr/libexec/mysqld: ready for connections.
Version: '5.1.73' socket: '/var/lib/mysql/mysql.sock' port: 3306 Source distribution
140819 7:47:52 [Note] /usr/libexec/mysqld: Normal shutdown
140819 7:47:52 [Note] Event Scheduler: Purging the queue. 0 events
140819 7:47:54 InnoDB: Starting shutdown...
140819 7:47:54 InnoDB: Shutdown completed; log sequence number 0 44253
140819 7:47:54 [Note] /usr/libexec/mysqld: Shutdown complete
140819 07:47:54 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended
140819 07:47:55 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
140819 7:47:55 InnoDB: Initializing buffer pool, size = 8.0M
140819 7:47:55 InnoDB: Completed initialization of buffer pool
140819 7:47:55 InnoDB: Started; log sequence number 0 44253
140819 7:47:55 [Note] Event Scheduler: Loaded 0 events
140819 7:47:55 [Note] /usr/libexec/mysqld: ready for connections.
Version: '5.1.73' socket: '/var/lib/mysql/mysql.sock' port: 3306 Source distribution
141023 15:10:15 [Note] /usr/libexec/mysqld: Normal shutdown
141023 15:10:15 [Note] Event Scheduler: Purging the queue. 0 events
141023 15:10:15 InnoDB: Starting shutdown...
141023 15:10:18 InnoDB: Shutdown completed; log sequence number 0 44253
141023 15:10:18 [Note] /usr/libexec/mysqld: Shutdown complete
141023 15:10:18 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended
141023 17:47:24 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
141023 17:47:24 InnoDB: Initializing buffer pool, size = 8.0M
141023 17:47:24 InnoDB: Completed initialization of buffer pool
141023 17:47:24 InnoDB: Started; log sequence number 0 44253
141023 17:47:24 [Note] Event Scheduler: Loaded 0 events
141023 17:47:24 [Note] /usr/libexec/mysqld: ready for connections.
Version: '5.1.73' socket: '/var/lib/mysql/mysql.sock' port: 3306 Source distribution
141024 7:36:31 [Note] /usr/libexec/mysqld: Normal shutdown
141024 7:36:31 [Note] Event Scheduler: Purging the queue. 0 events
141024 7:36:31 InnoDB: Starting shutdown...
141024 7:36:35 InnoDB: Shutdown completed; log sequence number 0 44253
141024 7:36:35 [Note] /usr/libexec/mysqld: Shutdown complete
141024 07:36:35 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended
141024 07:39:47 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
141024 7:39:47 InnoDB: Initializing buffer pool, size = 8.0M
141024 7:39:47 InnoDB: Completed initialization of buffer pool
141024 7:39:47 InnoDB: Started; log sequence number 0 44253
141024 7:39:47 [Note] Event Scheduler: Loaded 0 events
141024 7:39:47 [Note] /usr/libexec/mysqld: ready for connections.
Version: '5.1.73' socket: '/var/lib/mysql/mysql.sock' port: 3306 Source distribution
141027 17:31:58 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
141027 17:31:58 InnoDB: Initializing buffer pool, size = 8.0M
141027 17:31:58 InnoDB: Completed initialization of buffer pool
141027 17:31:59 InnoDB: Started; log sequence number 0 44253
141027 17:31:59 [Note] Event Scheduler: Loaded 0 events
141027 17:31:59 [Note] /usr/libexec/mysqld: ready for connections.
Version: '5.1.73' socket: '/var/lib/mysql/mysql.sock' port: 3306 Source distribution
141027 17:43:45 [ERROR] /usr/libexec/mysqld: Table './nagios/nagios_contactnotificationmethods' is marked as crashed and should be repaired
141027 17:43:45 [ERROR] /usr/libexec/mysqld: Table './nagios/nagios_eventhandlers' is marked as crashed and should be repaired
141027 17:43:45 [ERROR] /usr/libexec/mysqld: Table './nagios/nagios_hoststatus' is marked as crashed and should be repaired
141027 17:43:45 [ERROR] /usr/libexec/mysqld: Table './nagios/nagios_logentries' is marked as crashed and should be repaired
141027 17:43:45 [ERROR] /usr/libexec/mysqld: Table './nagios/nagios_notifications' is marked as crashed and should be repaired
141027 17:43:45 [ERROR] /usr/libexec/mysqld: Table './nagios/nagios_servicestatus' is marked as crashed and should be repaired
141027 17:43:45 [ERROR] /usr/libexec/mysqld: Table './nagios/nagios_statehistory' is marked as crashed and should be repaired
141027 17:43:47 [Note] Found 353449 of 353463 rows when repairing './nagios/nagios_contactnotificationmethods'
141027 17:43:47 [Note] Found 55 of 57 rows when repairing './nagios/nagios_eventhandlers'
141027 17:44:26 [Note] Found 3434845 of 3434836 rows when repairing './nagios/nagios_logentries'
141027 17:46:42 [Note] Found 13086583 of 13089236 rows when repairing './nagios/nagios_notifications'
141027 17:46:43 [Note] Found 11371 of 11414 rows when repairing './nagios/nagios_servicestatus'
141027 17:46:47 [Note] Found 793650 of 793649 rows when repairing './nagios/nagios_statehistory'
150313 19:03:57 [Note] /usr/libexec/mysqld: Normal shutdown
150313 19:03:57 [Note] Event Scheduler: Purging the queue. 0 events
150313 19:03:57 InnoDB: Starting shutdown...
150313 19:04:01 InnoDB: Shutdown completed; log sequence number 0 44253
150313 19:04:01 [Note] /usr/libexec/mysqld: Shutdown complete
150313 19:04:01 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended
150313 19:10:01 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
150313 19:10:01 InnoDB: Initializing buffer pool, size = 8.0M
150313 19:10:01 InnoDB: Completed initialization of buffer pool
150313 19:10:01 InnoDB: Started; log sequence number 0 44253
150313 19:10:01 [Note] Event Scheduler: Loaded 0 events
150313 19:10:01 [Note] /usr/libexec/mysqld: ready for connections.
Version: '5.1.73' socket: '/var/lib/mysql/mysql.sock' port: 3306 Source distribution
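The output above repeats the same "marked as crashed" error for several tables. A small sketch for condensing a log like this into one line per affected table, with a count. The `crashed_tables` function name is made up for illustration, and it only assumes the error wording shown in the output above.

```shell
#!/bin/sh
# List each table reported as crashed, with how many times it was reported.
# Reads log text on stdin; most frequently reported tables come first.
crashed_tables() {
    sed -n "s/.*Table '\([^']*\)' is marked as crashed.*/\1/p" | sort | uniq -c | sort -rn
}

# Example (needs read access to the log):
# sudo cat /var/log/mysqld.log | crashed_tables
```

Note that in the posted output the most recent such errors are dated 141027, several months before the acknowledgements were reported missing.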