fused server issue after upgrade to 4.1.8

This support forum board is for questions relating to Nagios Fusion.
jiec168
Posts: 26
Joined: Thu May 19, 2016 10:55 am

fused server issue after upgrade to 4.1.8

Post by jiec168 »

Hi,

I recently upgraded Nagios Fusion to 4.1.8 but didn't notice the fused server issue until now. Fusion is not getting any updated data from any of the Nagios Core servers.

When I open a fused server's settings, the settings test returns a green check, but the bar still shows a problem for that same fused server.
Screen Shot 2020-01-02 at 10.33.49 AM (3).png
When I click through to the fused server with the problem, it shows a page without any info. But if I fill out the page with the correct info and save, it complains that the setting already exists.
Screen Shot 2020-01-02 at 10.41.18 AM (3).png
Any ideas? I'm wondering whether the new version failed to carry over the original fused server settings during the upgrade. Thanks.
mbellerue
Posts: 1403
Joined: Fri Jul 12, 2019 11:10 am

Re: fused server issue after upgrade to 4.1.8

Post by mbellerue »

The first thing I'm wondering about is whether it carried your license over. Can you go to Admin -> License Information and show us a screenshot? You can black out the actual license key if you like, but I want to make sure that the License Stats is good.
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.

Be sure to check out our Knowledgebase for helpful articles and solutions!
jiec168
Posts: 26
Joined: Thu May 19, 2016 10:55 am

Re: fused server issue after upgrade to 4.1.8

Post by jiec168 »

I thought the license was perpetual as long as I keep using the existing working version, no? We didn't renew our license. It expired in mid-December.
tgriep
Madmin
Posts: 9177
Joined: Thu Oct 30, 2014 9:02 am

Re: fused server issue after upgrade to 4.1.8

Post by tgriep »

The screen capture cuts off the URL, so make sure it is set like the following:

Code:

https://servername/nagios/
Replace servername with the FQDN.

Then for the CGI Path: set it to the following

Code:

/cgi-bin/
See if that fixes the issue.
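
One way to double-check how the two fields combine is to build the full URL by hand. The sketch below assumes Fusion simply joins the URL and the CGI Path, which this thread does not confirm; `join_url` is a hypothetical helper, not part of Fusion.

```shell
# Hypothetical helper (not part of Nagios Fusion): join a base URL and a
# CGI path with exactly one slash between them and one trailing slash.
join_url() {
    base=${1%/}     # strip one trailing slash from the base URL
    path=${2#/}     # strip one leading slash from the CGI path
    path=${path%/}  # strip one trailing slash from the CGI path
    printf '%s/%s/\n' "$base" "$path"
}
```

For example, `join_url https://servername/nagios/ /cgi-bin/` gives the CGI directory URL. Appending a CGI name (for instance `status.cgi`, chosen here only as an illustration) and requesting it with `curl -s -o /dev/null -w '%{http_code}\n' -u USER URL` should show whether the path and credentials are accepted.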
jiec168
Posts: 26
Joined: Thu May 19, 2016 10:55 am

Re: fused server issue after upgrade to 4.1.8

Post by jiec168 »

Yes, the URL uses the FQDN. In our Nagios Core installation we use /nagios/cgi-bin; otherwise clicking the test button wouldn't have returned a green check.

This is the full URL when we trigger a CGI script:
https://nagios.smf.uc.int/nagios/cgi-bi ... e=overview

This is our license status.
Screen Shot 2020-01-02 at 2.35.54 PM (3).png
I suspect that the upgrade picked up the original fused server settings only partially, such as the names of the different data centers. The upgrade was done not last week (I lied) but in early December, before our two-year support license expired. Otherwise I would not have been able to upgrade with an expired license.
tgriep
Madmin
Posts: 9177
Joined: Thu Oct 30, 2014 9:02 am

Re: fused server issue after upgrade to 4.1.8

Post by tgriep »

Log in to the Fusion GUI and go to the Admin > System Settings menu.
Change the Log Level to Debug and check the boxes for these 2 options:
Enable writing log data to the specified file
Enable writing debug data to the specified file

Update the Settings and wait for 10 to 15 minutes for a poll to happen.

Log in as root to the Fusion server and go to the following folder.

Code:

/usr/local/nagiosfusion/var/log
Check the auth_subsys.log, fusion.log and fusion.debug files for any authentication errors.

From what I know, an expired license should not stop the polling of data from the remote servers.
It will not allow any more upgrades, though.
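
A quick way to see whether the subsystem logs are still being written is to list the ones that have not changed recently. This is a minimal sketch, assuming GNU `find`; `stale_logs` is a hypothetical helper, and the 15-minute threshold just mirrors the polling wait mentioned above.

```shell
# Hypothetical helper: print the *.log files directly under directory $1
# whose last modification is more than $2 minutes ago.
stale_logs() {
    find "$1" -maxdepth 1 -type f -name '*.log' -mmin +"$2" | sort
}
```

Running `stale_logs /usr/local/nagiosfusion/var/log 15` as root after waiting for a poll should print nothing if the subsystems are logging normally.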
jiec168
Posts: 26
Joined: Thu May 19, 2016 10:55 am

Re: fused server issue after upgrade to 4.1.8

Post by jiec168 »

All logs stopped right after the upgrade. I don't see any debug log there, and there are no error messages in any of the logs below either. Everything just stopped after the upgrade. Apache is running as the apache user and nagiosfusion is running as nagios. Does that matter? I have tried running Apache as nagios, and also switching Apache back to the apache user while chowning nagiosfusion to the apache user and group, but neither worked.

-rw-r--r--. 1 nagios nagios 5093631 Dec 8 15:55 dbmaint_subsys.log
-rw-r--r--. 1 nagios nagios 3330 Dec 8 15:59 poll_subsys.4.nagiosfusion.log
-rw-r-----. 1 nagios nagios 547369 Dec 8 15:59 auth_subsys.log
-rw-r--r--. 1 nagios nagios 2788 Dec 8 15:59 poll_subsys.5.nagiosfusion.log
-rw-r--r--. 1 nagios nagios 3059 Dec 8 15:59 poll_subsys.2.nagiosfusion.log
-rw-r--r--. 1 nagios nagios 4953 Dec 8 15:59 poll_subsys.3.nagiosfusion.log
-rw-r--r--. 1 nagios nagios 4218 Dec 8 15:59 poll_subsys.1.nagiosfusion.log
-rw-r-----. 1 nagios nagios 714363 Dec 8 16:00 poll_subsys.log-20191209.gz
-rw-r-----. 1 nagios nagios 864681 Dec 8 16:00 log_subsys.log-20191209.gz
-rw-r-----. 1 nagios nagios 4717279 Dec 8 16:00 cmd_subsys.log
-rw-r-----. 1 nagios nagios 4938443 Dec 8 16:00 sysstat_subsys.log
-rw-r-----. 1 nagios nagios 0 Dec 9 03:30 log_subsys.log
-rw-r-----. 1 nagios nagios 0 Dec 9 03:30 poll_subsys.log

root@smf-tools-nagiosfusion-001:/usr/local/nagiosfusion/var/log# ps -ef | grep httpd
root 5265 1 0 16:46 ? 00:00:00 /usr/sbin/httpd -DFOREGROUND
apache 5268 5265 3 16:46 ? 00:00:22 /usr/sbin/httpd -DFOREGROUND
apache 5269 5265 2 16:46 ? 00:00:19 /usr/sbin/httpd -DFOREGROUND
...........
tgriep
Madmin
Posts: 9177
Joined: Thu Oct 30, 2014 9:02 am

Re: fused server issue after upgrade to 4.1.8

Post by tgriep »

Go to the Admin > System Status menu and, in the Subsystem Status dashlet, look at the Polling Locks: option. If it is not zero, click on the Gear and clear the locks.

Because the files have not been updated, I would guess that the crond daemon is not running so the polling jobs are not getting the information and updating the logs.

Log in as root and run this to restart crond.

Code:

service crond start
Wait for a bit and see if the log files start to update and if the GUI shows a good status.

If not, run the following commands as root and post the output to the ticket.

Code:

ps -ef --cols=300 |grep nagios
tail -100 /var/log/cron
tail -100 /var/log/messages
jiec168
Posts: 26
Joined: Thu May 19, 2016 10:55 am

Re: fused server authentication issue

Post by jiec168 »

The polling lock was 0 when I looked.

I also looked at /var/log/cron. crond was running, but it has been logging the error below since December 8. I restarted crond as suggested, but it still doesn't work.

Dec 8 15:59:02 smf-tools-nagiosfusion-001 CROND[19990]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosfusion/cron/auth_subsys.php --max-time=60 >>/usr/local/nagiosfusion/var/log/auth_subsys.log 2>&1)
Dec 8 15:59:02 smf-tools-nagiosfusion-001 CROND[19989]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosfusion/cron/cmd_subsys.php --max-time=60 >>/usr/local/nagiosfusion/var/log/cmd_subsys.log 2>&1)
Dec 8 15:59:02 smf-tools-nagiosfusion-001 CROND[19994]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosfusion/cron/poll_subsys.php --max-time=60 --master-poll >>/usr/local/nagiosfusion/var/log/poll_subsys.log 2>&1)
Dec 8 16:00:01 smf-tools-nagiosfusion-001 CROND[20306]: (root) CMD (/scripts/audit_rotate.sh > /dev/null 2>&1)
Dec 8 16:00:01 smf-tools-nagiosfusion-001 crond[20284]: (nagios) PAM ERROR (Authentication token is no longer valid; new one required)
Dec 8 16:00:01 smf-tools-nagiosfusion-001 crond[20284]: (nagios) FAILED to authorize user with PAM (Authentication token is no longer valid; new one required)
Dec 8 16:00:01 smf-tools-nagiosfusion-001 crond[20283]: (nagios) PAM ERROR (Authentication token is no longer valid; new one required)
..........
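
The PAM errors in a cron log like this can be tallied per user instead of being read by eye. A minimal sketch, assuming the log lines have the `crond[PID]: (user) PAM ERROR (...)` shape seen in the excerpt; `pam_failures` is a hypothetical helper.

```shell
# Hypothetical helper: count "PAM ERROR" lines per user in the cron log $1.
# Prints one "user count" pair per line, sorted by user name.
pam_failures() {
    grep -F 'PAM ERROR' "$1" |
        sed -E 's/.*\]: \(([^)]+)\) PAM ERROR.*/\1/' |
        sort | uniq -c | awk '{ print $2, $1 }'
}
```

Called as `pam_failures /var/log/cron`, it shows which accounts crond is refusing to run jobs for.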

It turns out that Nagios Fusion stopped pulling data from the remote Nagios servers before the upgrade on December 17, so this is not an upgrade-related issue. The problem started before the upgrade.

Question: which user does the Nagios Fusion cron job use to pull data from the remote Nagios servers? Is it the one we set up in the fused server settings? That test gives me a green check, and I can log in to the remote Nagios servers with the same login and password.
Screen Shot 2020-01-05 at 6.05.47 PM.png
tgriep
Madmin
Posts: 9177
Joined: Thu Oct 30, 2014 9:02 am

Re: fused server issue after upgrade to 4.1.8

Post by tgriep »

The account that cron uses is the nagios user account.

The PAM error for crond could be caused by a few things.

1. The nagios account or its password has expired.
Run this as root to determine if the account is still active.

Code:

chage -l nagios
It should output something like this, showing that the account is active and the password is not set to expire.

Code:

Last password change                                    : Nov 20, 2018
Password expires                                        : never
Password inactive                                       : never
Account expires                                         : never
Minimum number of days between password change          : 0
Maximum number of days between password change          : 99999
Number of days of warning before password expires       : 7
If the user account is expired, run this as root to enable it.

Code:

passwd --delete nagios
chage -I -1 -m 0 -M 99999 -E -1 nagios
2. The /etc/hosts.allow or the /etc/hosts.deny file was changed so that nagios is no longer allowed to run cron jobs. Check them and adjust them if necessary.
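
The expiry check in step 1 can also be scripted. A minimal sketch, assuming GNU `date` and the English `chage -l` output format shown above; `password_state` is a hypothetical helper that reads that output on stdin.

```shell
# Hypothetical helper: read `chage -l USER` output on stdin and print
# "never", "expired", or "valid" based on the "Password expires" field.
password_state() {
    expires=$(sed -n 's/^Password expires[[:space:]]*:[[:space:]]*//p')
    case $expires in
        never) echo never; return 0 ;;
        '')    echo "no 'Password expires' line found" >&2; return 1 ;;
    esac
    # GNU date parses strings such as "Dec 08, 2019"; other values
    # (for example "password must be changed") make it fail.
    exp_epoch=$(date -d "$expires" +%s) || return 1
    if [ "$exp_epoch" -lt "$(date +%s)" ]; then
        echo expired
    else
        echo valid
    fi
}
```

As root, `chage -l nagios | password_state` would then report the state of the nagios account's password.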