The short story is that we were restoring a configuration snapshot made on a pre-upgrade version, so the database schema was outdated (it lacked the "exclude" field in the tbl_lnkServiceToHost table). As a result, "apply configuration" simply skipped the host_name field when writing the services file.
Isn't there a DB consistency check that catches these situations before a configuration is applied? If not, adding one could really save some headaches.
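To make the request concrete, here is a minimal sketch of the kind of check I mean, in Python. It is only an illustration, not how Nagios XI works internally: the `EXPECTED_COLUMNS` map and the `missing_columns` function are hypothetical names, and the only table/column pair taken from our case is "exclude" in tbl_lnkServiceToHost. It assumes the actual column names have already been read from the database by some other means.

```python
# Hypothetical map of table name -> columns the upgraded version expects.
# Only the "exclude" column of tbl_lnkServiceToHost comes from our case.
EXPECTED_COLUMNS = {
    "tbl_lnkServiceToHost": {"exclude"},
}


def missing_columns(actual_columns, expected=EXPECTED_COLUMNS):
    """Return {table: sorted list of missing columns} for incomplete tables.

    actual_columns maps each table name to an iterable of the column
    names that currently exist in the restored database.
    """
    missing = {}
    for table, needed in expected.items():
        present = set(actual_columns.get(table, ()))
        gap = needed - present
        if gap:
            missing[table] = sorted(gap)
    return missing
```

If the result is not empty, "apply configuration" could refuse to run and report the missing fields instead of silently writing incomplete files. On MySQL, the input could be collected from information_schema.columns, but that part is left out here.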
The long story: why were we restoring a snapshot made in a previous version? Because we had updated Nagios XI (manually) and everything seemed to be OK, but, again, "apply configuration" was giving a similar but different error:
Template '' specified in service definition could not be not found (config file '/usr/local/nagios/etc/servicetemplates.cfg', starting on line 620)
Template '' specified in service definition could not be not found (config file '/usr/local/nagios/etc/servicetemplates.cfg', starting on line 628)
Template '' specified in service definition could not be not found (config file '/usr/local/nagios/etc/servicetemplates.cfg', starting on line 1154)
And /usr/local/nagios/etc/servicetemplates.cfg had something similar to:
define service {
name xiwizard_oracleserverspace_service
use
check_command check_xi_oracleserverspace
register 0
}
Note that the "use" field is empty. This was the case not only on the lines reported by the errors but in almost all the service definitions.
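For anyone who wants to check their own files for the same symptom, here is a rough Python sketch that lists the service blocks whose "use" directive has no value. It is a simplified parser written for this post (the function name `find_empty_use` is made up), it assumes one directive per line and ";" or "#" comments, and it has not been tested against every config layout.

```python
import re


def find_empty_use(cfg_text):
    """Return (line_number, object_name) for each 'define service' block
    whose 'use' directive is present but has no value."""
    problems = []
    in_block = False
    name = None
    empty_use_line = None
    for lineno, raw in enumerate(cfg_text.splitlines(), start=1):
        line = raw.split(";", 1)[0].strip()  # drop inline ';' comments
        if not line or line.startswith("#"):
            continue
        if re.match(r"define\s+service\s*\{", line):
            in_block, name, empty_use_line = True, None, None
        elif in_block and line == "}":
            if empty_use_line is not None:
                problems.append((empty_use_line, name))
            in_block = False
        elif in_block:
            parts = line.split(None, 1)
            key = parts[0]
            value = parts[1].strip() if len(parts) > 1 else ""
            if key in ("name", "service_description") and name is None:
                name = value
            elif key == "use" and not value:
                empty_use_line = lineno
    return problems
```

Reading /usr/local/nagios/etc/servicetemplates.cfg and passing its contents to this function should give the line number of every empty "use" together with the template name, which is quicker than following the verification errors one at a time.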
After a long debugging session we realized that, a long time ago, we had renamed the service template called "xiwizard_generic_service" (the one usually with id 2) to something else (e.g. "TMP_srv_gen_noH24"), and the update scripts for some reason didn't like the new name (again, with a somewhat cryptic message).
So we restored the server to its state before the Nagios XI update (we had a virtual machine snapshot), changed the name of the service template "TMP_srv_gen_noH24" back to "xiwizard_generic_service", and ran the update procedure again. This time everything went fine.
The question now is: is changing default service names forbidden? Or is it a fault of the update scripts?
> When you say "we" are you connected with @riccardo.spisa?
Yes, he's a colleague.
> How are you changing the default service names?
We had only changed the service template named "xiwizard_generic_service" (the one that comes with the default installation, with id 2) to the new name "TMP_srv_gen_noH24" (a long time ago, well before the upgrade attempt), via CCM -> Service Templates -> selected the template with id 2 -> Template Name.
> Are you using the bulk renaming tool? Located at --> XI home --> Configure --> Core Config Manager --> Under tools --> Bulk Renaming tool?
No, the bulk renaming tool was not used. We just changed the name of the service template with id 2 back to "xiwizard_generic_service", and the upgrade completed just fine.
While the service template with id 2 had the custom name "TMP_srv_gen_noH24", the update stopped with the errors: Template '' specified in service definition could not be not found (config file '/usr/local/nagios/etc/servicetemplates.cfg', starting on line xxx).
Note that many other service templates use the service template with id 2 (they have it under Manage Templates -> Assigned).