Re: [Nagios-devel] [RFC] - NDOUtils duplicate lines problem

Support forum for Nagios Core, Nagios Plugins, NCPA, NRPE, NSCA, NDOUtils and more. Engage with the community of users including those using the open source solutions.
Locked
Guest

Re: [Nagios-devel] [RFC] - NDOUtils duplicate lines problem

Post by Guest »

Hi,

Hendrik Baecker wrote:
> a few days ago Øyvind Nordang, Julien Mathis and Michael Friedrich,
> discussed about duplicate rows in the servicecheck table.
> Michael dropped me a few lines off-list showing up the problem exists
> in some more tables.
I wrote that here too but noone replied on that. I did a little research
where the faulty patch was coming from but I could only see the initial
commit by Ethan on that.
>
> The underlying problem:
> MySQL queries like
> "insert into... on duplicate key update..."
> needs primary and/or unique keys to be hit by the query to trigger the
> UPDATE instead of the INSERT.
> Most of the relevant keys for those queries are actually primary nor
> unique keys, they are just KEYs.
Jep. Just as the MySQL manual points out.
>
> So we have(98%) / might have (2%) duplicate entries in the following
> tables:
>
> nagios_servicechecks
> nagios_systemcommands
> nagios_timedeventqueue
> nagios_timedevents
Those users who are using the 2 years old 1.4b7 are not affected - the
commit was after that release. So if they read this, do not upgrade and
wait for the next version (or do not apply the unique key changes).
>
> Now to the problem from the point of development: I see no good
> solution to handle this.
Well regarding the fact that NDOUtils doesn't seem to come out of beta,
you might just change the db scheme using the unique keys again.
>
> One part of the solution is to change the keys to define the mandatory
> UNIQUE keys, but this can't be done cleanly on existing data since you
> can't define some UNIQUE keys on rows that would be duplicates.
>
> Solutions to be discussed:
>
> 1. Big-Bang-Solution
> Define the user have to truncate above tables to be able to set the
> right keys.
> This will result in data loss.
You might be able to write a query which extracts duplicated rows on the
unique constraints you want to redefine (group by, order by) - and then
redo the 2x on duplicate key insert or update statements. But this is
only a solution for people who need exactly those tables for reporting.
But if you want to learn procedures this would be the right place for
starting correcting this *pardon* big fuckup.
>
> 2.
> It might be possible to write some nifty upgrade scripts to dump the
> data out of the database, set the unique keys and try to re-import the
> data with some fault tolerance.
> Anyone out there with some experience on this level?
> Contra:
> Depending on the amount of data, such a job might be: a long runner
> (time); complicated; not realy error resistant...
I would not recommend that on a production server since the table will
be locked then and ndo2db blocks and Nagios becomes a dead daemon from
the blocking neb module.
>
> What thinks this community about the cost-benefit ratio?
Regarding the blocking NEB module i would recommend data backup and
truncate table.
>
> IMO:
> nagios_timedevent* table are not so important and data loss is irrelevant.
> nagios_systemcommands might be just more interesting for someone
> but nagios_servicechecks seems to me like a treasure of gold, data
> loss in here might be realy painful
systemcommands are issued e.g. if perfdata is enabled. servicechecks
need to be re-organised in case. but using 2 backups - one to work on
and the other as is.

>
> Any comments?
> May be someone is able to adress some voodoo data repair query/script
> to eliminate those duplicates before key changing?
Some kind of the following script (best practical would be a perl script)

0. create a new table after servicechecks scheme but apply the drop
key/add unique key statement
1. query to get all the data: select * from icinga_servicechecks order
by instance_id, service_object_id, start_time, start_time_usec;
2. step through and fetch odd and even results, check if constraints
from above are the same. if not continue;
3. all data from the first row should be kept, only the ones to be
updated are:
check_type, current_check_attempt, max_check_attempts, state,
state_type, start_time, start_time_usec, end_time, end_time_usec,
timeout='%d', early_timeout, execut

...[email truncated]...


This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]
Locked