Nagios XI UI showing database error after server reboot

sitaonair
Posts: 55
Joined: Wed Jan 06, 2016 3:36 am

Post by sitaonair »

Hi,

We rebooted the host that Nagios XI runs on in order to take some backups. After startup, the web UI reported a database error and suggested running the following script:

Code: Select all

/usr/local/nagiosxi/scripts/repair_databases.sh
After running the script, the web UI still did not load, and multiple entries like these appeared in the Nagios logs:

Code: Select all

2018-12-13T09:39:17.696504+00:00 nagxi-01 nagios: wproc: GLOBAL SERVICE EVENTHANDLER job 184 from worker Core Worker 15784 is a non-check helper but exited with return code 1
2018-12-13T09:39:17.696528+00:00 nagxi-01 nagios: wproc:   early_timeout=0; exited_ok=1; wait_status=256; error_code=0;
2018-12-13T09:39:17.696533+00:00 nagxi-01 nagios: wproc:   stdout line 01: UNABLE TO CONNECT TO DB - EXITING!
mysqld.log looks fine, and I was able to log in to the MySQL database from the console:

Code: Select all

181213 09:30:28 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended
181213 09:32:49 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
181213  9:32:49  InnoDB: Initializing buffer pool, size = 8.0M
181213  9:32:49  InnoDB: Completed initialization of buffer pool
181213  9:32:49  InnoDB: Started; log sequence number 0 44243
181213  9:32:49 [Note] Event Scheduler: Loaded 0 events
181213  9:32:49 [Note] /usr/libexec/mysqld: ready for connections.
I am running Nagios XI 5.4.11. I had assumed that Nagios XI was running entirely on MySQL, but the issue was resolved after running the following:

Code: Select all

service postgresql start
As I inherited this Nagios setup, how can I check whether it was upgraded from a version before 5, which might explain why PostgreSQL needs to be started? Lastly, how can I check whether this PostgreSQL instance is indeed still in use? When I check the service status, the following is returned:

Code: Select all

[root@nagxi-01 mysql]# service postgresql status
postmaster dead but pid file exists
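
For context, this status text is what the SysV status helper typically reports when a PID file exists but no process with the recorded PID can be found. The following is a minimal sketch for cross-checking that by hand. The function name pidfile_state is hypothetical, and the PID file path in the usage comment is an assumption about a stock CentOS 6 PostgreSQL package, so verify it on your own system.

```shell
# Prints "running", "stale" or "missing" for the PID file given as $1.
# Run as root: kill -0 fails for processes owned by other users otherwise.
pidfile_state() {
    [ -r "$1" ] || { echo missing; return; }
    # The first line of the file is expected to hold the PID.
    read -r pid < "$1" || true
    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
        echo running
    else
        echo stale
    fi
}

# Usage (assumed path, not confirmed in this thread):
#   pidfile_state /var/run/postmaster.5432.pid
```

A result of "stale" would agree with the status message above, while "running" would suggest that the status check and the actual process state disagree.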
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Post by scottwilkerson »

You can check whether PostgreSQL is still in use by running the following command:

Code: Select all

grep pgsql /usr/local/nagiosxi/html/config.inc.php
If it returns the following, PostgreSQL is still in use:

Code: Select all

        "dbtype" => 'pgsql',
It's possible you are getting that error if this is a CentOS or RHEL 7 machine, where you may get the proper result by running:

Code: Select all

systemctl status postgresql
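
Since config.inc.php defines more than one database connection, it can help to see which section each dbtype belongs to, rather than only whether the string pgsql occurs. The sketch below is one possible way to do that with awk. The function name list_dbtypes is hypothetical, and the parsing assumes the layout used in a typical config.inc.php, where each section starts with a line of the form "name" => array( and its dbtype line comes before any nested array.

```shell
# Prints one "section dbtype" pair per database section found in the
# PHP config file given as $1.
list_dbtypes() {
    awk '
        # Remember the most recent  "name" => array(  key.
        /"[A-Za-z_]+"[ \t]*=>[ \t]*array\(/ {
            split($0, parts, "\"")
            section = parts[2]
        }
        # When a dbtype line appears, print it with the current section.
        /"dbtype"[ \t]*=>/ {
            split($0, q, "\047")   # \047 is a single quote
            print section, q[2]
        }
    ' "$1"
}

# Usage:
#   list_dbtypes /usr/local/nagiosxi/html/config.inc.php
```

Any section reported with pgsql still depends on the PostgreSQL service being up.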

Post by sitaonair »

Hi,

Code: Select all

[nagios@nagxi-01 custom]$ grep pgsql /usr/local/nagiosxi/html/config.inc.php
        "dbtype" => 'pgsql',
[nagios@nagxi-01 custom]$ cat /etc/redhat-release
CentOS release 6.7 (Final)
From the output, it seems that pgsql is still in use for this setup. Is it right to assume that this Nagios installation was upgraded from 4.x?

Code: Select all

// db-specific connection information
$cfg['db_info'] = array(
    "nagiosxi" => array(
        "dbtype" => 'pgsql',
        "dbserver" => '',
        "user" => 'nagiosxi',
        "pwd" => '',
        "db" => 'nagiosxi',
        "dbmaint" => array( // variables affecting maintenance of db
            "max_auditlog_age" => 30, // max time (in DAYS) to keep audit log entries
            "max_commands_age" => 480, // max time (minutes) to keep commands
            "max_events_age" => 480, // max time (minutes) to keep events
            "optimize_interval" => 60, // time (in minutes) between db optimization runs
            "repair_interval" => 0, // time (in minutes) between db repair runs
        ),
May I confirm that the PostgreSQL database stores audit-related information and not probe check information (which is stored in MySQL)? What do the scheduled backups from the Nagios Admin settings cover?

Just to confirm, can I control data retention for the probe alarms through the max_statehistory_age setting?

Code: Select all

    "ndoutils" => array(
        "dbtype" => 'mysql',
        "dbserver" => 'localhost',
        "user" => 'ndoutils',
        "pwd" => 'n@gweb',
        "db" => 'nagios',
        "dbmaint" => array( // variables affecting maintenance of ndoutils db

            "max_externalcommands_age" => 7, // max time (in DAYS) to keep external commands
            "max_logentries_age" => 90, // max time (in DAYS) to keep log entries
            "max_statehistory_age" => 730, // max time (in DAYS) to keep state history information
            "max_notifications_age" => 90, // max time (in DAYS) to keep notifications
            "max_timedevents_age" => 5, // max time (minutes) to keep timed events
            "max_systemcommands_age" => 5, // max time (minutes) to keep system commands
            "max_servicechecks_age" => 5, // max time (minutes) to keep service checks
            "max_hostchecks_age" => 5, // max time (minutes) to keep host checks
            "max_eventhandlers_age" => 5, // max time (minutes) to keep event handlers
            "optimize_interval" => 60, // time (in minutes) between db optimization runs
            "repair_interval" => 0, // time (in minutes) between db repair runs
        ),
Lastly, what would be the right way to ensure PostgreSQL is started after a host reboot?

Thanks!

Post by scottwilkerson »

sitaonair wrote: From the output, it seems that pgsql is still in use for this setup. Is it right to assume that this Nagios installation was upgraded from 4.x?
That is correct. Nagios XI does not migrate installations that had PostgreSQL in use from their initial install.

sitaonair wrote: May I confirm that the PostgreSQL database stores audit-related information and not probe check information (which is stored in MySQL)? What do the scheduled backups from the Nagios Admin settings cover?
PostgreSQL hosts only user and system settings, and none of the probe check results. This database is also backed up as part of the scheduled backups and would be restored on a restore.

sitaonair wrote: Just to confirm, can I control data retention for the probe alarms through the max_statehistory_age setting?
Actually, it would be best to change the retention time under Admin -> Performance Settings -> Database tab.

This overrides the setting you see in the config.

sitaonair wrote: Lastly, what would be the right way to ensure PostgreSQL is started after a host reboot?
Run the following from the command line:

Code: Select all

chkconfig postgresql on
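
To confirm that the change took effect, the runlevel flags printed by chkconfig --list postgresql can be inspected. The sketch below is a small helper for that check. The function name on_in_runlevels is hypothetical, and it assumes the usual CentOS 6 output format, in which each runlevel appears as a field such as 3:on or 3:off.

```shell
# Reads one line of "chkconfig --list <service>" output on stdin and
# succeeds only if the service is "on" in every runlevel given as an
# argument.
on_in_runlevels() {
    line=$(cat)
    for level in "$@"; do
        case "$line" in
            *"${level}:on"*) ;;      # enabled in this runlevel
            *) return 1 ;;           # off or not listed
        esac
    done
    return 0
}

# Usage:
#   chkconfig --list postgresql | on_in_runlevels 3 5 \
#       && echo "postgresql will start at boot"
```

Runlevels 3 and 5 are used in the usage line because they are the common multi-user runlevels; adjust them if the host boots into a different one.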