Page 7 of 9
Re: Nagios Core 3.5.1 to 4.3.4. Upgrade
Posted: Thu Jul 12, 2018 10:54 am
by scottwilkerson
This looks correct.
One thing I cannot be certain of is whether there are multiple KoboFS2.kobo.corp definitions. Please run:
Code: Select all
grep -R KoboFS2.kobo.corp /usr/local/nagios/etc/
grep -R linux-servers /usr/local/nagios/etc/
also
Code: Select all
grep -R generic-host /usr/local/nagios/etc/|grep -v use
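The greps above match any line that mentions the name, including services that merely reference it. As a rough sketch for narrowing that down to actual duplicate definitions, the helpers below (hypothetical names, not part of Nagios) look only inside `define host`, `define hostgroup`, `define command` and similar blocks and report any `<type>_name` value defined more than once. It assumes config files end in .cfg and that paths contain no spaces.

```shell
# Hypothetical helper: print "type name file" for every <type>_name directive
# found inside a matching "define <type> {" block under directory $1.
list_object_names() {
  find "$1" -type f -name '*.cfg' -exec awk '
    /^[ \t]*define[ \t]+[a-z]+/ {
      type = $2
      sub(/[{].*/, "", type)      # handles "define host{" with no space
      next
    }
    substr($1, 1, 1) == "}" { type = ""; next }
    type != "" && $1 == (type "_name") {
      name = $2
      sub(/;.*/, "", name)        # drop a trailing ";comment"
      print type, name, FILENAME
    }
  ' {} +
}

# Hypothetical helper: report objects whose name is defined more than once,
# followed by the files that define them.
find_duplicate_objects() {
  list_object_names "$1" | awk '
    { key = $1 " " $2; count[key]++; files[key] = files[key] " " $3 }
    END { for (k in count) if (count[k] > 1) print k ":" files[k] }
  ' | sort
}
```

For example, `find_duplicate_objects /usr/local/nagios/etc` would print one line per duplicated object. Templates that use the plain `name` directive are not covered by this sketch.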
Re: Nagios Core 3.5.1 to 4.3.4. Upgrade
Posted: Thu Jul 12, 2018 11:17 am
by mtripodi
OK, then I'm out of ideas. The obvious issue is that my host definitions are not seeing the generic-host and generic-service templates, which contain the max_check_attempts value. I have no idea why! I confirmed Nagios is looking at this directory: /usr/local/nagios/etc/objects/templates
I have attached screenshots of the output from the commands you suggested. It looks like the generic-host template only shows up in 2 places: the original templates.cfg, which I renamed to templates.cfg.default so Nagios would not read it, and the directory above (my directory).
I'm not sure why it's complaining when everything is there and defined. I need to proceed with my testing ASAP.
Re: Nagios Core 3.5.1 to 4.3.4. Upgrade
Posted: Thu Jul 12, 2018 3:21 pm
by scottwilkerson
If you want to zip up your whole /usr/local/nagios/etc directory and PM it to me, I am willing to test verifying the config on my system.
Re: Nagios Core 3.5.1 to 4.3.4. Upgrade
Posted: Thu Jul 12, 2018 3:41 pm
by mtripodi
Sure, see attached zip file of my /etc directory. To note, I ended up manually adding the max_check_attempts value to each host definition. Didn't take that long just tedious as hell.
I'm still getting the error "could not add object property" for the hostgroups.cfg file. Please help with that.
Re: Nagios Core 3.5.1 to 4.3.4. Upgrade
Posted: Thu Jul 12, 2018 4:13 pm
by scottwilkerson
Remove this from
/usr/local/nagios/etc/objects/localhost.cfg
Code: Select all
# Define an optional hostgroup for Linux machines
define hostgroup {
    hostgroup_name  linux-servers   ; The name of the hostgroup
    alias           Linux Servers   ; Long name of the group
    members         localhost       ; Comma separated list of hosts that belong to this group
}
Also, these commands are duplicated in commands.cfg, so remove the duplicate file:
Code: Select all
rm -f /usr/local/nagios/etc/objects/commands/command_notify.cfg
In nagios.cfg comment out these lines
Code: Select all
cfg_file=/usr/local/nagios/etc/objects/templates/generic-host.cfg
cfg_file=/usr/local/nagios/etc/objects/templates/generic-service.cfg
so that they become
Code: Select all
#cfg_file=/usr/local/nagios/etc/objects/templates/generic-host.cfg
#cfg_file=/usr/local/nagios/etc/objects/templates/generic-service.cfg
because you are already loading them in this directive
Code: Select all
cfg_dir=/usr/local/nagios/etc/objects/templates
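Since a file that is listed in cfg_file and also sits under a cfg_dir gets loaded twice, it can help to check nagios.cfg for this overlap mechanically. A minimal sketch follows; the function name is hypothetical, and it assumes directives start at the beginning of the line (optionally after spaces), paths contain no spaces, and files under a cfg_dir are picked up when they end in .cfg.

```shell
# Hypothetical helper: list active cfg_file entries in the main config ($1)
# that already live under an active cfg_dir (including subdirectories).
find_redundant_cfg_files() {
  main_cfg="$1"
  # Uncommented cfg_dir values, with any trailing slashes stripped.
  dirs=$(sed -n 's/^ *cfg_dir=//p' "$main_cfg" | sed 's:/*$::')
  sed -n 's/^ *cfg_file=//p' "$main_cfg" | while IFS= read -r file; do
    for dir in $dirs; do
      case "$file" in
        "$dir"/*.cfg) echo "$file is already loaded via cfg_dir=$dir" ;;
      esac
    done
  done
}
```

Running `find_redundant_cfg_files /usr/local/nagios/etc/nagios.cfg` before the change described above should list the two template files; after commenting them out it should print nothing.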
Then this definition in /usr/local/nagios/etc/objects/services/services.cfg has no hosts assigned so it needs to be removed
Code: Select all
#Special check for Content Ops Volume Reporting. This checks a scheduled task running on Kobofs1 once a day, starting at 9:10AM, for 10 minutes.
#If, at any time during that 10 minute window, the scheduled task completes with an error code other than 0, the Content OPS team is notified.
define service {
service_description New Volume Report Scheduled Task
use generic-service
#Conent Ops only needs this service to check once a day, just after the scheduled task is configured to run on kobofs1
check_period 9_10AM_for_10_min
#Since we will only have a 10 minute window to test this service each day, we will test the service more rapidly than normal.
check_interval 2
#Since this check will only run once a day, and since fixing the root cause does not clear the alert in nagios until the next day, we dont report recovery.
notification_options w,u,c
contact_groups contentoperations,admins
# hostgroup_name content_ops_scheduled_tasks
check_command check_nrpe!CheckTaskSched!"filter=title eq 'new volume report' AND exit_code ne 0" "syntax='%title%': Task completed with Exit Code %exit_code%. Please check the VolumeReports on TS3!" MaxCrit=1
}
Add something like this to /usr/local/nagios/etc/objects/commands.cfg
Code: Select all
define command {
    command_name  check_nrpe
    command_line  $USER1$/check_nrpe -H $HOSTADDRESS$ -t 30 -c $ARG1$ $ARG2$
}
You will need to repeat this for all your other missing commands, like
snmp_windows_dhcp_avalaible_addresses
check_netapp_vol_cap
check_nrpe_1arg
etc...
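To get the full list of missing commands up front rather than one error at a time, something along these lines may help. It is only a sketch: the function name is hypothetical, it looks at `check_command` directives only (event handlers and notification commands are not covered), and it assumes config files end in .cfg.

```shell
# Hypothetical helper: print every command referenced by a check_command
# directive under directory $1 that has no matching command_name definition.
find_missing_commands() {
  cfg_dir="$1"
  defined=$(mktemp)
  used=$(mktemp)
  # Names that are defined somewhere.
  find "$cfg_dir" -type f -name '*.cfg' \
    -exec awk '$1 == "command_name" { print $2 }' {} + | sort -u > "$defined"
  # Names that are referenced; everything from the first "!" on is arguments.
  find "$cfg_dir" -type f -name '*.cfg' \
    -exec awk '$1 == "check_command" { sub(/!.*/, "", $2); print $2 }' {} + | sort -u > "$used"
  # Lines only in the "used" list are the missing commands.
  comm -23 "$used" "$defined"
  rm -f "$defined" "$used"
}
```

Calling `find_missing_commands /usr/local/nagios/etc` prints one command name per line; each of those needs a `define command` block like the one above, copied from the old server where possible.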
Re: Nagios Core 3.5.1 to 4.3.4. Upgrade
Posted: Thu Jul 12, 2018 6:13 pm
by mtripodi
I made all the changes you suggested, except adding the additional commands. Just added the one you specified in commands.cfg.
I then ran the config test after making all the changes and now see 28 errors! Most appear to be missing service check commands and a host check command. I'm unable to see all the errors as I can't scroll up the window further or export.
I'm not sure what change caused all these errors, but we seem to have backtracked instead of making progress.
Please HELP! I have attached a screenshot.
Re: Nagios Core 3.5.1 to 4.3.4. Upgrade
Posted: Fri Jul 13, 2018 1:46 pm
by mtripodi
Please assist with suggestions on how to correct the configuration and what we did to cause these errors. I had one error and now have 28 after performing the changes you mentioned. Please help!
Re: Nagios Core 3.5.1 to 4.3.4. Upgrade
Posted: Fri Jul 13, 2018 2:24 pm
by jomann
It looks like pretty much all of the errors you have now come from missing commands. Do you have an old commands.cfg file containing all those commands, so you can copy them into your new one? If you copy over the ones you can see now, you will then be able to see the rest of the output.
You can also scroll over that by doing this after you run the Nagios Core config verification:
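One common way to do this, sketched here as an assumption and not necessarily the command originally posted, is to save the whole verification output to a file and page through it. The helper name and the log path are hypothetical; the binary and config paths are assumed from the /usr/local/nagios layout used elsewhere in this thread.

```shell
# Hypothetical helper: run a verification command, keep its full output in a
# log file ($1), and print only the lines starting with "Error:" or "Warning:".
verify_to_log() {
  log_file="$1"
  shift
  # The check exits non-zero when it finds errors; keep going so we can filter.
  "$@" > "$log_file" 2>&1 || true
  grep -E '^(Error|Warning):' "$log_file"
}

# Example call (paths are assumptions):
# verify_to_log /tmp/nagios-verify.log \
#   /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
# less /tmp/nagios-verify.log    # page through the complete output
```

Keeping the log file also gives you something to attach here instead of a screenshot.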
Re: Nagios Core 3.5.1 to 4.3.4. Upgrade
Posted: Wed Jul 18, 2018 2:44 pm
by mtripodi
I was able to locate the missing commands it was asking for and copy them over to the test server. I currently have zero warnings and errors!
There are a couple of things left, such as LDAP authentication setup and email (postfix). It did give me errors for the following configuration file, advising "Extinfo definitions will no longer be defined this way in future versions". I'm assuming this will continue to function the way I have it configured for now; however, they are showing up but not rendering in the web console. See 2 attachments.
Any suggestions on where to put this configuration instead of a separate file?
Re: Nagios Core 3.5.1 to 4.3.4. Upgrade
Posted: Thu Jul 19, 2018 8:20 am
by scottwilkerson
The Extinfo files are no longer used in Core 4+