ndomod: Error writing to data sink!

Posted: Mon Oct 02, 2017 8:59 am
by JakeHatMacys
Nagios.log is saying:

[1506951587] Successfully launched command file worker with pid 22230
[1506952264] ndomod: Error writing to data sink! Some output may get lost...
[1506952264] ndomod: Please check remote ndo2db log, database connection or SSL Parameters
[1506952328] ndomod: Successfully reconnected to data sink! 0 items lost, 282 queued items to flush.
[1506952328] ndomod: Successfully flushed 282 queued items to data sink.
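The bracketed numbers in these log lines are Unix epoch seconds. As a minimal sketch (the function name is hypothetical, and it assumes GNU date for the -d @SECONDS form), the stamps can be rewritten as UTC dates to line them up with other logs:

```shell
#!/bin/sh
# Rewrite the leading [epoch] stamp of each nagios.log line as a UTC date.
# Assumes GNU date (for -d @SECONDS); reads log lines on stdin.
humanize_nagios_log() {
    while IFS= read -r line; do
        case "$line" in
            \[[0-9]*\]*)
                epoch=${line#\[}        # drop the leading "["
                epoch=${epoch%%\]*}     # keep only the text before the first "]"
                rest=${line#*\]}        # everything after the first "]"
                printf '[%s]%s\n' \
                    "$(date -u -d "@$epoch" '+%Y-%m-%d %H:%M:%S')" "$rest"
                ;;
            *)
                # Lines without a leading [epoch] stamp pass through unchanged.
                printf '%s\n' "$line"
                ;;
        esac
    done
}
```

It would be used as humanize_nagios_log < nagios.log, with the path adjusted to wherever the log lives on the system in question.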

XI status is stuck like this and my states aren't updating properly on the GUI:
Capture.JPG
Currently on version 5.2.

I've tried restarting a few things but nothing seems to clear this. Any help is appreciated.

Thanks.

Re: ndomod: Error writing to data sink!

Posted: Mon Oct 02, 2017 10:19 am
by tmcdonald
Usually that is indicative of cron not running. You can restart it with service crond restart.

Barring that, is your disk full? Run df -h to confirm. You can also look at the logs for some of those system components to see if anything sticks out:

Code:

tail -20 /usr/local/nagiosxi/var/dbmaint.log
tail -20 /usr/local/nagiosxi/var/sysstat.log
tail -20 /usr/local/nagiosxi/var/cmdsubsys.log
Run those as root and post the results here.
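For the disk check, a rough sketch that flags filesystems at or above a usage threshold. The function name and the default of 90 percent are assumptions for illustration, not something from Nagios XI. It reads df -P output on stdin, since the POSIX format keeps each filesystem on one line:

```shell
#!/bin/sh
# Print "mountpoint usage%" for every filesystem at or above a threshold.
# Reads `df -P` output on stdin; mount points containing spaces are not handled.
full_filesystems() {
    threshold=${1:-90}   # hypothetical default, not taken from the thread
    awk -v limit="$threshold" '
        NR > 1 {                        # skip the df header line
            use = $5
            sub(/%$/, "", use)          # "67%" -> "67"
            if (use + 0 >= limit + 0) print $6, $5
        }'
}
```

Typical use would be df -P | full_filesystems 90; no output means nothing is at or over the threshold.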

Re: ndomod: Error writing to data sink!

Posted: Mon Oct 02, 2017 10:37 am
by scottwilkerson
This looks like either crond isn't running

Code:

service crond status
Or the nagios user is expired

Code:

chage -l nagios
If fixing either of these doesn't work, can you post the results of

Code:

tail -100 /var/log/cron
and a view of the Nagios XI cron file

Code:

cat /etc/cron.d/nagiosxi

Re: ndomod: Error writing to data sink!

Posted: Mon Oct 02, 2017 1:41 pm
by JakeHatMacys
Cron is running:
var]$ service crond status
crond (pid 2783) is running...
File system space is fine; I should have said that before, sorry.
Filesystem                          Size  Used Avail Use% Mounted on
/dev/mapper/localvg00-lv_slash      7.7G  656M  6.7G   9% /
tmpfs                               3.9G     0  3.9G   0% /dev/shm
/dev/mapper/localvg00-lv_archive    2.0G  178M  1.7G  10% /archive
/dev/sda1                           194M   90M   94M  49% /boot
/dev/mapper/localvg00-lv_home       4.0G  145M  3.6G   4% /home
/dev/mapper/localvg00-lv_opt        197G  402M  187G   1% /opt
/dev/mapper/localvg00-lv_tmp        4.0G  267M  3.5G   7% /tmp
/dev/mapper/localvg00-lv_usr         63G  2.5G   57G   5% /usr
/dev/mapper/localvg00-lv_usr_local   62G   39G   20G  67% /usr/local
/dev/mapper/localvg00-lv_var         63G  2.7G   57G   5% /var
$ chage -l nagios
Last password change : Dec 08, 2014
Password expires : never
Password inactive : never
Account expires : never
Minimum number of days between password change : 0
Maximum number of days between password change : 99999
Number of days of warning before password expires : 7
I don't have access to the tail... trying to get it now.

Re: ndomod: Error writing to data sink!

Posted: Mon Oct 02, 2017 1:44 pm
by JakeHatMacys
So that's the other strange thing: all the logs in /usr/local/nagiosxi/var/ look blank. I do not have root, though, just "sudo - su Nagios" user access, but via sudo we have pretty much what we need to do most things for the application.

Re: ndomod: Error writing to data sink!

Posted: Mon Oct 02, 2017 1:57 pm
by JakeHatMacys
So even though cron is running, the output of the tail command is:
Oct 2 14:45:01 e******* crond[2783]: (/usr/bin/php) ERROR (getpwnam() failed)
Oct 2 14:45:01 e******* CROND[15155]: (root) CMD (LANG=C LC_ALL=C /usr/bin/mrtg /etc/mrtg/mrtg.cfg --lock-file /var/lock/mrtg/mrtg_l --confcache-file /var/lib/mrtg/mrtg.ok)
Oct 2 14:46:01 e******* crond[2783]: (/usr/bin/php) ERROR (getpwnam() failed)
Oct 2 14:47:02 e******* crond[2783]: (/usr/bin/php) ERROR (getpwnam() failed)
Oct 2 14:48:01 e******* crond[2783]: (/usr/bin/php) ERROR (getpwnam() failed)
Oct 2 14:49:01 e******* crond[2783]: (/usr/bin/php) ERROR (getpwnam() failed)
Oct 2 14:50:01 e******* crond[2783]: (/usr/bin/php) ERROR (getpwnam() failed)
Oct 2 14:50:01 e******* CROND[19958]: (root) CMD (LANG=C LC_ALL=C /usr/bin/mrtg /etc/mrtg/mrtg.cfg --lock-file /var/lock/mrtg/mrtg_l --confcache-file /var/lib/mrtg/mrtg.ok)
Oct 2 14:50:01 e******* CROND[19959]: (root) CMD (/usr/lib64/sa/sa1 1 1)
Oct 2 14:51:01 e******* crond[2783]: (/usr/bin/php) ERROR (getpwnam() failed)
Oct 2 14:52:01 e******* crond[2783]: (/usr/bin/php) ERROR (getpwnam() failed)
Oct 2 14:53:01 e******* crond[2783]: (/usr/bin/php) ERROR (getpwnam() failed)
Oct 2 14:54:01 e******* crond[2783]: (/usr/bin/php) ERROR (getpwnam() fail
That definitely looks like a problem.
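In the lines above, the parenthesized token before ERROR sits where crond logs the user name on the (root) CMD lines, so (/usr/bin/php) suggests crond was given the command path where it expected a user. A small sketch (hypothetical function name) to summarize which names crond failed to look up and how often:

```shell
#!/bin/sh
# List the names crond could not look up, with a failure count for each.
# Reads /var/log/cron-style lines on stdin; lines that do not match are ignored.
failed_cron_users() {
    sed -n 's/.*: (\([^)]*\)) ERROR (getpwnam() failed).*/\1/p' |
        sort | uniq -c | awk '{print $2, $1}'
}
```

It could be fed with something like tail -100 /var/log/cron | failed_cron_users, which needs read access to that log.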

Re: ndomod: Error writing to data sink!

Posted: Mon Oct 02, 2017 2:05 pm
by tmcdonald
You are likely missing a column in your /etc/cron.d/nagiosxi file specifying which user the command should run as. Can you check your file and make sure it looks something like this?

Code:

# /etc/cron.d/nagiosxi: crontab fragment for nagiosxi

# Backup MySQL & PostgreSQL Databases
0   7 * * * root   /root/scripts/automysqlbackup
0   7 * * * root   /root/scripts/autopostgresqlbackup > /dev/null 2>&1

*   * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php >> /usr/local/nagiosxi/var/sysstat.log 2>&1
*   * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/cmdsubsys.php >> /usr/local/nagiosxi/var/cmdsubsys.log 2>&1
*   * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/eventman.php >> /usr/local/nagiosxi/var/eventman.log 2>&1
*   * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/event_handler.php >> /usr/local/nagiosxi/var/event_handler.log 2>&1
*   * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/feedproc.php >> /usr/local/nagiosxi/var/feedproc.log 2>&1
*   * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php >> /usr/local/nagiosxi/var/perfdataproc.log 2>&1
*   * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/nom.php >> /usr/local/nagiosxi/var/nom.log 2>&1
*   * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/reportengine.php >> /usr/local/nagiosxi/var/reportengine.log 2>&1
*/5 * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/dbmaint.php >> /usr/local/nagiosxi/var/dbmaint.log 2>&1
*   * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/cleaner.php >> /usr/local/nagiosxi/var/cleaner.log 2>&1
01  * * * * nagios /usr/local/nagiosxi/cron/recurringdowntime.pl >> /usr/local/nagiosxi/var/recurringdowntime.log 2>&1
*   * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/deadpool.php >> /usr/local/nagiosxi/var/deadpool.log 2>&1
If it does, then your cron daemon may have issues looking up users for some reason. Are there any security modifications made on this system?
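To check the user column mechanically, here is a minimal sketch. The function name is made up for illustration, and it assumes every job line uses the five time fields followed by a user, as in the file above; @reboot-style entries are out of scope. It reports any line whose sixth field is not an account the system can resolve:

```shell
#!/bin/sh
# Report system-crontab lines (read on stdin) whose sixth field is not a
# known account. Assumes the five-field time format shown in the thread.
check_cron_users() {
    while read -r m h dom mon dow user cmd; do
        case "$m" in
            ''|\#*|*=*) continue ;;   # blank lines, comments, VAR=value settings
        esac
        # id exits non-zero when the account cannot be looked up.
        if ! id -u "$user" >/dev/null 2>&1; then
            printf 'unknown user "%s": %s\n' "$user" "$cmd"
        fi
    done
}
```

Running check_cron_users < /etc/cron.d/nagiosxi should print nothing when every entry names a resolvable user. If it flags nagios itself, that points at user lookup on the host rather than at the file.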

Re: ndomod: Error writing to data sink!

Posted: Mon Oct 02, 2017 2:07 pm
by scottwilkerson
You didn't show this file, which is where the missing cron jobs live:

Code:

cat /etc/cron.d/nagiosxi

Re: ndomod: Error writing to data sink!

Posted: Mon Oct 02, 2017 3:08 pm
by JakeHatMacys
# /etc/cron.d/nagiosxi: crontab fragment for nagiosxi

# Backup MySQL & PostgreSQL Databases
0 7 * * * root /root/scripts/automysqlbackup
0 8 * * * root /root/scripts/autopostgresqlbackup

* * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php > /usr/local/nagiosxi/var/sysstat.log 2>&1
* * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/cmdsubsys.php > /usr/local/nagiosxi/var/cmdsubsys.log 2>&1
* * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/eventman.php > /usr/local/nagiosxi/var/eventman.log 2>&1
* * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/feedproc.php > /usr/local/nagiosxi/var/feedproc.log 2>&1
* * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php > /usr/local/nagiosxi/var/perfdataproc.log 2>&1
* * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/nom.php > /usr/local/nagiosxi/var/nom.log 2>&1
* * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/reportengine.php > /usr/local/nagiosxi/var/reportengine.log 2>&1
*/5 * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/dbmaint.php > /usr/local/nagiosxi/var/dbmaint.log 2>&1
* * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/cleaner.php > /usr/local/nagiosxi/var/cleaner.log 2>&1
01 * * * * nagios /usr/local/nagiosxi/cron/recurringdowntime.pl > /usr/local/nagiosxi/var/recurringdowntime.log 2>&1
*/5 * * * * nagios /usr/bin/php -q /usr/local/nagiosxi/cron/deadpool.php > /usr/local/nagiosxi/var/deadpool.log 2>&1
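One difference from the sample file posted earlier in the thread is that these entries redirect with > rather than >>. With >, the shell truncates the log every time a job starts, so each log only ever holds the latest run's output. That may be related to the blank logs mentioned earlier, though nothing in the thread confirms it. A rough sketch (hypothetical function name) that lists entries using the truncating form:

```shell
#!/bin/sh
# Print crontab lines (read on stdin) that redirect stdout with ">" (truncate)
# rather than ">>" (append). Comment lines are skipped, and forms such as
# "2>&1" or "2>file" are deliberately not counted.
truncating_redirects() {
    grep -v '^[[:space:]]*#' | grep -E '(^|[^>&0-9])>[^>&]'
}
```

It would be run as truncating_redirects < /etc/cron.d/nagiosxi. Note that a redirect to /dev/null is reported too, which is harmless.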

Re: ndomod: Error writing to data sink!

Posted: Mon Oct 02, 2017 3:38 pm
by scottwilkerson
For some reason your system cannot run any of these cron jobs; it is likely a permissions error.

Please send back the results of running the following:

Code:

sudo su -c '/usr/bin/php -q /usr/local/nagiosxi/cron/nom.php > /usr/local/nagiosxi/var/nom.log 2>&1'
ls -la /usr/bin/php
ls -la /usr/local/nagiosxi/cron
ls -la /usr/local/nagiosxi/var
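As a complement to reading the ls -la output by eye, a minimal sketch that tests access directly. The function name is hypothetical, and the result only means something when it is run as the account the jobs use, which is nagios in the cron file:

```shell
#!/bin/sh
# Report whether the current user has the given access (r, w or x) to each
# path. Prints MISSING for paths that do not exist.
check_access() {
    mode=$1; shift                      # one of: r, w, x
    for path in "$@"; do
        if [ ! -e "$path" ]; then
            printf 'MISSING %s\n' "$path"
        elif test "-$mode" "$path"; then
            printf 'OK %s\n' "$path"
        else
            printf 'DENIED(%s) %s\n' "$mode" "$path"
        fi
    done
}
```

For the paths in this thread that would be check_access x /usr/bin/php, check_access r /usr/local/nagiosxi/cron/nom.php and check_access w /usr/local/nagiosxi/var.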