Acknowledgements missing

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
rajasegar
Posts: 1018
Joined: Sun Mar 30, 2014 10:49 pm

Post by rajasegar »

Yesterday we got complaints from users that all their acknowledgements had gone missing.
All the alert history is still there.

Please advise on this problem and on how to restore the acknowledgements.

Thanks.
5 x Nagios 5.6.9 Enterprise Edition
RHEL 6 & 7
rrdcached & ramdisk optimisation
cmerchant
Posts: 546
Joined: Wed Sep 24, 2014 11:19 am

Re: Acknowledgements missing

Post by cmerchant »

Acknowledgements are stored in the status.dat and retention.dat files.

You should check your retention settings. Also, do you have a RAM disk configured? Was Nagios restarted? Was the system rebooted?

Rather than duplicating everything here, I'll point you to this forum thread:

http://support.nagios.com/forum/viewtop ... =6&t=26166

and another forum thread here:

http://support.nagios.com/forum/viewtop ... =16&t=8266
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Re: Acknowledgements missing

Post by abrist »

Acknowledgements are stored in the retention.dat file.
Was this file removed at some point?
Is state retention currently enabled?

grep "retention\|retain_state" /usr/local/nagios/etc/nagios.cfg
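
The same check could be scripted roughly as follows. This is only a sketch: it assumes retention.dat marks acknowledged hosts and services with a `problem_has_been_acknowledged=1` line, and `cfg_value` and `check_retention` are made-up helper names, not Nagios tools.

```shell
# Print the value of a directive from the Nagios main config file.
# $1 = config file, $2 = directive name (last occurrence wins)
cfg_value() {
    sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*//p" "$1" | tail -n 1
}

# Report whether state retention is enabled and how many acknowledged
# hosts/services the retention file currently records.
# $1 = path to nagios.cfg
check_retention() {
    cfg=$1
    enabled=$(cfg_value "$cfg" retain_state_information)
    file=$(cfg_value "$cfg" state_retention_file)

    if [ "$enabled" != "1" ]; then
        echo "state retention is disabled (retain_state_information=${enabled:-unset})"
        return 1
    fi
    if [ ! -s "$file" ]; then
        echo "retention file missing or empty: $file"
        return 1
    fi

    # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true"
    acks=$(grep -c '^[[:space:]]*problem_has_been_acknowledged=1' "$file" || true)
    echo "retention enabled; $file records $acks acknowledged hosts/services"
}

# Example: check_retention /usr/local/nagios/etc/nagios.cfg
```

A count of zero on a system where users had acknowledged problems would suggest the acknowledgements are gone from the retention file itself.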
Former Nagios employee
"It is turtles. All. The. Way. Down. . . .and maybe an elephant or two."
VI VI VI - The editor of the Beast!
Come to the Dark Side.
rajasegar
Posts: 1018
Joined: Sun Mar 30, 2014 10:49 pm

Re: Acknowledgements missing

Post by rajasegar »

Nothing was removed. We did move the following to a ramdisk as per your PDF instructions.

[nagios@nagiosprodxi1 nagiosramdisk]$ grep "retention\|retain_state" /usr/local/nagios/etc/nagios.cfg
retain_state_information=1
retention_update_interval=60
state_retention_file=/usr/local/nagios/var/retention.dat
[nagios@nagiosprodxi1 nagiosramdisk]$

These are in the ramdisk, which has 1 GB capacity:

[nagios@nagiosprodxi1 nagiosramdisk]$ ls -l
total 44468
-rw-r--r-- 1 nagios nagios 18443323 Mar 20 07:03 objects.cache
-rw-r--r-- 1 nagios nagios 26913292 Mar 20 07:22 status.dat
drwxrwxr-x 2 nagios nagios       40 Mar 20 07:03 tmp
[nagios@nagiosprodxi1 nagiosramdisk]$
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Re: Acknowledgements missing

Post by ssax »

For the alerts that were acknowledged but whose acknowledgements are now missing: if you look at the alerts, do they show as being handled?

Post the output of the following:

ls -lh /usr/local/nagios/var/retention.dat
rajasegar
Posts: 1018
Joined: Sun Mar 30, 2014 10:49 pm

Re: Acknowledgements missing

Post by rajasegar »

No, they show up under unhandled. All the comments etc. are missing.

[nagios@nagiosprodxi1 perfdata]$ ls -lh /usr/local/nagios/var/retention.dat
-rw------- 1 nagios nagios 27M Mar 23 13:09 /usr/local/nagios/var/retention.dat
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Re: Acknowledgements missing

Post by ssax »

If you have a backup of the file, you could back up the current file, stop the nagios service, overwrite retention.dat with the backup, and start the nagios service, but you would lose any alerts/acks made since the backup was taken.

I'm not sure how they went missing, but if you don't have a backup, or don't want to lose anything recorded between your last backup and now, you would most likely need to have the users re-acknowledge the alerts to get them back.

You could compress the retention.dat file and PM it to me. The acknowledgements are most likely missing from it, but I will take a look if you would like.
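
The restore steps described in this post could be sketched as a shell function like the one below. It is an illustration, not a tested procedure: the backup path is whatever older copy you have, the `.pre-restore` suffix and the function names are made up, and it assumes the `service` command manages nagios on your RHEL release. Setting `DRY_RUN=1` only prints the commands.

```shell
# Run a command, or just print it when DRY_RUN is set.
run() {
    if [ -n "$DRY_RUN" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Restore retention.dat from an older copy, in the order given in the post.
# $1 = backup copy of retention.dat (hypothetical path, supplied by you)
# $2 = live retention file (defaults to the path shown in nagios.cfg above)
restore_retention() {
    backup=$1
    live=${2:-/usr/local/nagios/var/retention.dat}

    if [ ! -s "$backup" ]; then
        echo "backup not found or empty: $backup" >&2
        return 1
    fi

    run cp -p "$live" "$live.pre-restore" || return 1   # keep the current file
    run service nagios stop || return 1                 # stop before overwriting
    run cp -p "$backup" "$live" || return 1             # put the old copy in place
    run service nagios start
}

# Example (prints the commands without running them):
#   DRY_RUN=1 restore_retention /path/to/old/retention.dat
```

Anything acknowledged after the backup was taken is lost by this, which is why the current file is copied aside first.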
rajasegar
Posts: 1018
Joined: Sun Mar 30, 2014 10:49 pm

Re: Acknowledgements missing

Post by rajasegar »

This is the second time this has happened.
The funny thing is I could still see all the acknowledgements in retention.dat even after they went missing from the UI.
So this looks like a Nagios bug.
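
One way to document that kind of mismatch might be to compare the acknowledged objects in retention.dat with those in the live status.dat on the ramdisk. The sketch below assumes both files consist of `name {` ... `}` blocks with `host_name=`, `service_description=` and `problem_has_been_acknowledged=` lines (`host`/`service` blocks in retention.dat, `hoststatus`/`servicestatus` blocks in status.dat). The function names are hypothetical.

```shell
# List acknowledged objects in a retention.dat or status.dat style file,
# one "host;service" per line (service is empty for host acknowledgements).
list_acked() {
    awk '
        { sub(/^[ \t]+/, "") }    # status.dat indents its fields
        /^(host|service|hoststatus|servicestatus) [{]$/ {
            inblk = 1; host = ""; svc = ""; ack = 0; next
        }
        inblk && /^host_name=/           { host = substr($0, 11); next }
        inblk && /^service_description=/ { svc = substr($0, 21); next }
        inblk && /^problem_has_been_acknowledged=1$/ { ack = 1; next }
        inblk && /^[}]$/ { if (ack) print host ";" svc; inblk = 0 }
    ' "$1" | sort -u
}

# Print objects acknowledged in the retention file but not in the status file.
# $1 = retention.dat, $2 = status.dat
acks_missing_from_status() {
    tmp_r=$(mktemp)
    tmp_s=$(mktemp)
    list_acked "$1" > "$tmp_r"
    list_acked "$2" > "$tmp_s"
    comm -23 "$tmp_r" "$tmp_s"    # lines present only in the first list
    rm -f "$tmp_r" "$tmp_s"
}

# Example (the status.dat location depends on where your ramdisk is mounted):
#   acks_missing_from_status /usr/local/nagios/var/retention.dat /path/to/ramdisk/status.dat
```

Empty output would mean every acknowledgement in retention.dat is also present in status.dat, which would shift attention away from these two files.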
tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Re: Acknowledgements missing

Post by tgriep »

Can you run this and post back the output?

tail -100 /var/log/mysqld.log
Be sure to check out our Knowledgebase for helpful articles and solutions!
rajasegar
Posts: 1018
Joined: Sun Mar 30, 2014 10:49 pm

Re: Acknowledgements missing

Post by rajasegar »

[my@nagiosproddb1 ~]$ sudo tail -100 /var/log/mysqld.log
140819  7:42:46 [ERROR] /usr/libexec/mysqld: Table './nagios/nagios_servicestatus' is marked as crashed and should be repaired
140819  7:42:46 [ERROR] /usr/libexec/mysqld: Table './nagios/nagios_servicestatus' is marked as crashed and should be repaired
140819  7:42:46 [ERROR] /usr/libexec/mysqld: Table './nagios/nagios_servicestatus' is marked as crashed and should be repaired
140819  7:42:46 [ERROR] /usr/libexec/mysqld: Table './nagios/nagios_servicestatus' is marked as crashed and should be repaired
140819  7:42:46 [ERROR] /usr/libexec/mysqld: Table './nagios/nagios_servicestatus' is marked as crashed and should be repaired
140819  7:42:46 [Note] /usr/libexec/mysqld: Normal shutdown

140819  7:42:46 [Note] Event Scheduler: Purging the queue. 0 events
140819  7:42:48  InnoDB: Starting shutdown...
140819  7:42:50  InnoDB: Shutdown completed; log sequence number 0 44253
140819  7:42:50 [Note] /usr/libexec/mysqld: Shutdown complete

140819 07:42:50 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended
140819 07:47:44 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
140819  7:47:44  InnoDB: Initializing buffer pool, size = 8.0M
140819  7:47:44  InnoDB: Completed initialization of buffer pool
140819  7:47:44  InnoDB: Started; log sequence number 0 44253
140819  7:47:44 [Note] Event Scheduler: Loaded 0 events
140819  7:47:44 [Note] /usr/libexec/mysqld: ready for connections.
Version: '5.1.73'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  Source distribution
140819  7:47:52 [Note] /usr/libexec/mysqld: Normal shutdown

140819  7:47:52 [Note] Event Scheduler: Purging the queue. 0 events
140819  7:47:54  InnoDB: Starting shutdown...
140819  7:47:54  InnoDB: Shutdown completed; log sequence number 0 44253
140819  7:47:54 [Note] /usr/libexec/mysqld: Shutdown complete

140819 07:47:54 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended
140819 07:47:55 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
140819  7:47:55  InnoDB: Initializing buffer pool, size = 8.0M
140819  7:47:55  InnoDB: Completed initialization of buffer pool
140819  7:47:55  InnoDB: Started; log sequence number 0 44253
140819  7:47:55 [Note] Event Scheduler: Loaded 0 events
140819  7:47:55 [Note] /usr/libexec/mysqld: ready for connections.
Version: '5.1.73'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  Source distribution
141023 15:10:15 [Note] /usr/libexec/mysqld: Normal shutdown

141023 15:10:15 [Note] Event Scheduler: Purging the queue. 0 events
141023 15:10:15  InnoDB: Starting shutdown...
141023 15:10:18  InnoDB: Shutdown completed; log sequence number 0 44253
141023 15:10:18 [Note] /usr/libexec/mysqld: Shutdown complete

141023 15:10:18 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended
141023 17:47:24 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
141023 17:47:24  InnoDB: Initializing buffer pool, size = 8.0M
141023 17:47:24  InnoDB: Completed initialization of buffer pool
141023 17:47:24  InnoDB: Started; log sequence number 0 44253
141023 17:47:24 [Note] Event Scheduler: Loaded 0 events
141023 17:47:24 [Note] /usr/libexec/mysqld: ready for connections.
Version: '5.1.73'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  Source distribution
141024  7:36:31 [Note] /usr/libexec/mysqld: Normal shutdown

141024  7:36:31 [Note] Event Scheduler: Purging the queue. 0 events
141024  7:36:31  InnoDB: Starting shutdown...
141024  7:36:35  InnoDB: Shutdown completed; log sequence number 0 44253
141024  7:36:35 [Note] /usr/libexec/mysqld: Shutdown complete

141024 07:36:35 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended
141024 07:39:47 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
141024  7:39:47  InnoDB: Initializing buffer pool, size = 8.0M
141024  7:39:47  InnoDB: Completed initialization of buffer pool
141024  7:39:47  InnoDB: Started; log sequence number 0 44253
141024  7:39:47 [Note] Event Scheduler: Loaded 0 events
141024  7:39:47 [Note] /usr/libexec/mysqld: ready for connections.
Version: '5.1.73'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  Source distribution
141027 17:31:58 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
141027 17:31:58  InnoDB: Initializing buffer pool, size = 8.0M
141027 17:31:58  InnoDB: Completed initialization of buffer pool
141027 17:31:59  InnoDB: Started; log sequence number 0 44253
141027 17:31:59 [Note] Event Scheduler: Loaded 0 events
141027 17:31:59 [Note] /usr/libexec/mysqld: ready for connections.
Version: '5.1.73'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  Source distribution
141027 17:43:45 [ERROR] /usr/libexec/mysqld: Table './nagios/nagios_contactnotificationmethods' is marked as crashed and should be repaired
141027 17:43:45 [ERROR] /usr/libexec/mysqld: Table './nagios/nagios_eventhandlers' is marked as crashed and should be repaired
141027 17:43:45 [ERROR] /usr/libexec/mysqld: Table './nagios/nagios_hoststatus' is marked as crashed and should be repaired
141027 17:43:45 [ERROR] /usr/libexec/mysqld: Table './nagios/nagios_logentries' is marked as crashed and should be repaired
141027 17:43:45 [ERROR] /usr/libexec/mysqld: Table './nagios/nagios_notifications' is marked as crashed and should be repaired
141027 17:43:45 [ERROR] /usr/libexec/mysqld: Table './nagios/nagios_servicestatus' is marked as crashed and should be repaired
141027 17:43:45 [ERROR] /usr/libexec/mysqld: Table './nagios/nagios_statehistory' is marked as crashed and should be repaired
141027 17:43:47 [Note] Found 353449 of 353463 rows when repairing './nagios/nagios_contactnotificationmethods'
141027 17:43:47 [Note] Found 55 of 57 rows when repairing './nagios/nagios_eventhandlers'
141027 17:44:26 [Note] Found 3434845 of 3434836 rows when repairing './nagios/nagios_logentries'
141027 17:46:42 [Note] Found 13086583 of 13089236 rows when repairing './nagios/nagios_notifications'
141027 17:46:43 [Note] Found 11371 of 11414 rows when repairing './nagios/nagios_servicestatus'
141027 17:46:47 [Note] Found 793650 of 793649 rows when repairing './nagios/nagios_statehistory'
150313 19:03:57 [Note] /usr/libexec/mysqld: Normal shutdown

150313 19:03:57 [Note] Event Scheduler: Purging the queue. 0 events
150313 19:03:57  InnoDB: Starting shutdown...
150313 19:04:01  InnoDB: Shutdown completed; log sequence number 0 44253
150313 19:04:01 [Note] /usr/libexec/mysqld: Shutdown complete

150313 19:04:01 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended
150313 19:10:01 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
150313 19:10:01  InnoDB: Initializing buffer pool, size = 8.0M
150313 19:10:01  InnoDB: Completed initialization of buffer pool
150313 19:10:01  InnoDB: Started; log sequence number 0 44253
150313 19:10:01 [Note] Event Scheduler: Loaded 0 events
150313 19:10:01 [Note] /usr/libexec/mysqld: ready for connections.
Version: '5.1.73'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  Source distribution
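
To see at a glance which tables a log like this reports as crashed, and on which dates, something along these lines could help. It is only a sketch: it assumes the `YYMMDD` timestamp prefix and the message wording shown in the output above, and `crashed_tables` is a made-up name.

```shell
# Print "date database/table" for every "marked as crashed" entry in a
# mysqld error log, de-duplicated and sorted by date.
# $1 = path to the mysqld log
crashed_tables() {
    sed -n "s|^\([0-9]\{6\}\) .*Table '\./\([^']*\)' is marked as crashed.*|\1 \2|p" "$1" | sort -u
}

# Example: crashed_tables /var/log/mysqld.log
```

Run against the excerpt above, this would list only entries dated 140819 and 141027, so the crash reports in this log all date from 2014.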