Config Error after rename

Posted: Thu Jul 09, 2015 9:51 am
by notverynick
Hi Guys,

The following events occurred on our Nagios XI box, and we are now unable to change a host's name and have also lost its status history.

1) Everything OK.
2) Manual rename, which did not work: the web 'apply config' claimed a host with the old name was still present. The CLI verify passed, but there was still a reference to the old name.
3) Rolled back.
4) Applied the enterprise license.
5) Used the renaming tool.
6) Changed the host IP (which was part of the rename).
7) Config won't apply; the same ghost name appears.
8) Rolled back.
9) Changed the IP (so the check now works, but not under the correct host name).
10) Host status history lost. I believe the DB-side rollback did not take the renaming tool's action into account, so the status history is now tied to the intended host name.

Please let me know what I should do to resolve this.

Thanks

Nick

Re: Config Error after rename

Posted: Thu Jul 09, 2015 10:01 am
by jdalrymple
There is a bug in older versions where the apply config fails if you change the name of a host that is the parent of anything.
Is the host you're renaming indeed the parent of any other host(s)? If so, would it be terribly difficult to temporarily make it NOT the parent of said host(s)?

A delete/write/verify/restart may also be a valid workaround - but I don't recall for sure.

Re: Config Error after rename

Posted: Thu Jul 09, 2015 10:04 am
by notverynick
Thanks for the quick response!

I do not believe that this host is a parent. However, as there is no 'host children' button in the CCM, I'm not sure how I can check this easily.

Could you suggest a method?

Re: Config Error after rename

Posted: Thu Jul 09, 2015 10:10 am
by jdalrymple
From "Home" click on "Network Status Map" in the left-pane.

Parents are closer to Nagios Process, children are farther away (duh) :)
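
As an alternative to reading the map, the parent/child relationship can also be checked against the written config files. What follows is only a minimal sketch: it assumes the host definitions live as standard `define host { ... }` blocks in `.cfg` files under a directory you pass in, and that parents are declared with the standard `parents` directive. The function names `parse_directives` and `find_children` are made up for this illustration and are not part of Nagios XI.

```python
import re
from pathlib import Path

# Matches one "define host { ... }" block. "hostgroup" etc. do not match
# because only whitespace may sit between "host" and the opening brace.
HOST_BLOCK = re.compile(r"define\s+host\s*\{(.*?)\}", re.DOTALL)


def parse_directives(block_body):
    """Return a dict of directive -> value for one object definition."""
    directives = {}
    for line in block_body.splitlines():
        line = line.split(";", 1)[0].strip()  # drop inline ';' comments
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)  # directive name, then the rest of the line
        if len(parts) == 2:
            directives[parts[0]] = parts[1].strip()
    return directives


def find_children(config_root, parent_name):
    """List host_name values whose 'parents' directive includes parent_name."""
    children = []
    for cfg in sorted(Path(config_root).rglob("*.cfg")):
        text = cfg.read_text(errors="replace")
        for match in HOST_BLOCK.finditer(text):
            directives = parse_directives(match.group(1))
            parents = [p.strip() for p in directives.get("parents", "").split(",")]
            if parent_name in parents:
                children.append(directives.get("host_name", "<unnamed>"))
    return children


if __name__ == "__main__":
    # The directory is an assumption; point it at wherever your configs are written.
    for child in find_children("/usr/local/nagios/etc", "Ricoh Large BS3"):
        print(child)
```

An empty result suggests the host is not listed as a parent in the files on disk; it says nothing about what the CCM database holds.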

Re: Config Error after rename

Posted: Thu Jul 09, 2015 10:11 am
by lmiltchev
What is the Nagios XI version that you are currently using? Have you tried running the Write Config Tool in the following order:

CCM->Tools->Write Config Files->Delete->Write->Verify->Apply Configuration (if there are no errors in the previous two steps)?

Re: Config Error after rename

Posted: Thu Jul 09, 2015 10:49 am
by notverynick
LOL, how simple. Sadly, the network status map seems to scale now and there is no zoom function; before, you had to scroll all around it! So I can see everything, but it's so small that I can't make out any detail. I also tried the hypermap, but that didn't help either, nor did NagVis.

I tried the write/delete/write/verify/apply, which worked, but making the name change again then failed. Please see the attached output from the renaming tool and from the subsequent 'apply config', both done after the suggested steps.

We are running Nagios XI 2014R1.3.
[Attachment: Screen Shot 2015-07-09 at 16.39.58.png]
Note that we tried to rename 'Ricoh Large BS3' to 'Ricoh Large 15GS'

Error: Could not find any host matching 'Ricoh Large BS3' (config file '/usr/local/nagios/etc/services/Printer Status.cfg', starting on line 14)
Error: Failed to expand host list '15GSRicoh1' for service 'Printer Status' (/usr/local/nagios/etc/services/Printer Status.cfg:14)

The service check in the error is attached to a host group that contains all our Ricoh printers. Drilling down into it, we see the host correctly renamed in the CCM.
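
One way to narrow down where the old name still survives is to search the written config files for it. The sketch below assumes only that the configs are plain-text `.cfg` files under the directory shown in the error message; the helper name `find_references` is hypothetical. A hit shows where the stale name sits on disk, not why it was written there.

```python
from pathlib import Path


def find_references(config_root, name):
    """Return a list of (path, line_number, line) for config lines mentioning name."""
    hits = []
    for cfg in sorted(Path(config_root).rglob("*.cfg")):
        lines = cfg.read_text(errors="replace").splitlines()
        for lineno, line in enumerate(lines, 1):
            if name in line:
                hits.append((str(cfg), lineno, line.strip()))
    return hits


if __name__ == "__main__":
    # Old host name and config directory taken from the error output above.
    for path, lineno, line in find_references("/usr/local/nagios/etc", "Ricoh Large BS3"):
        print(f"{path}:{lineno}: {line}")
```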

Thanks

Re: Config Error after rename

Posted: Thu Jul 09, 2015 11:37 am
by abrist
notverynick wrote:we are running Nagios XI 2014R1.3
og. Is updating a possibility? Some of the CCM code changes I made in the early versions of 2014 are . . . . not so great. Many edge cases were missed.

Re: Config Error after rename

Posted: Fri Jul 10, 2015 2:35 pm
by notverynick
Yeah, I don't think that should be a problem. I expect I'll wait until Monday now though, so I'll post back results then.

Thanks and have a great weekend!

Re: Config Error after rename

Posted: Mon Jul 13, 2015 8:29 am
by tgriep
Let us know how the upgrade works for you.

Re: Config Error after rename

Posted: Thu Jul 16, 2015 11:49 am
by notverynick
Hi Guys,

Just picking this up due to a hectic couple of days.

I'm getting this in the backup script before the update:

mysqldump: Error: 'Got error 28 from storage engine' when trying to dump tablespaces

According to Google, this means there is no disk space left on either / or /tmp. Can this be resolved? I want a clean backup before I proceed with the upgrade.
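
Error 28 is the operating system's ENOSPC code ("No space left on device" on Linux), so a quick look at free space is a reasonable first check. The sketch below reports free space for / and /tmp; the function name `free_space_report` is invented for this example, and other paths (for instance wherever the backup script writes, or the MySQL data directory) can be passed in if they sit on separate partitions.

```python
import errno
import os
import shutil


def free_space_report(paths=("/", "/tmp")):
    """Return {path: (free_bytes, percent_free)} for each path."""
    report = {}
    for path in paths:
        usage = shutil.disk_usage(path)
        # Guard against pseudo-filesystems that report a total size of zero.
        percent = 100.0 * usage.free / usage.total if usage.total else 0.0
        report[path] = (usage.free, percent)
    return report


if __name__ == "__main__":
    # Show what the OS calls error 28 on this machine.
    print(errno.errorcode.get(28), "-", os.strerror(28))
    for path, (free, percent) in free_space_report().items():
        print(f"{path}: {free / 2**20:.0f} MiB free ({percent:.1f}%)")
```

This only checks bytes free; a filesystem that has run out of inodes can produce the same error while still showing free space.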

Also, the backup script now seems to have locked up with this error. I've never seen it run this long before, and I've upgraded numerous times.

So I rebooted the box, since the script had been left running for some time. After that I re-ran it and got:

mysqldump: Got error: 2002: Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock' (2) when trying to connect
Error backing up MySQL database 'nagios' - check the password in this script!
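
The "(2)" in the first message is the OS error code ENOENT, which usually means the socket file does not exist, most often because the MySQL server is not running, rather than a wrong password. A small sketch for checking the socket directly is below; the function name `check_unix_socket` and its return labels are made up for this example, and the default path is simply the one from the error message.

```python
import os
import socket
import stat


def check_unix_socket(path="/var/lib/mysql/mysql.sock", timeout=2.0):
    """Classify path as 'missing', 'not-a-socket', 'refused', or 'listening'."""
    try:
        mode = os.stat(path).st_mode
    except FileNotFoundError:
        return "missing"  # same condition as errno 2 in the mysqldump error
    if not stat.S_ISSOCK(mode):
        return "not-a-socket"
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
        client.settimeout(timeout)
        try:
            client.connect(path)
        except OSError:
            return "refused"  # socket file is there but nothing accepts on it
    return "listening"


if __name__ == "__main__":
    print(check_unix_socket())
```

A result of "missing" or "refused" points at the server process not being up; "listening" only shows that something accepts connections on the socket, not that the database is healthy.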


Really keen to get this all resolved and slightly worried that the DB is now in a state...

Thanks!