Nagios segfault shortly after startup

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
rlacasse
Posts: 38
Joined: Tue Jun 12, 2012 12:59 pm

Post by rlacasse »

My Nagios system has been running fine for years using the downloaded VMware image. Yesterday, I upgraded to 5.4.0. Today, out of the blue, services stopped reporting. Upon investigation, the system is reporting that the nagios service isn't running. Restarting it from the Nagios XI interface has no effect.

I logged in to the backend and ran the manual startup command, "/usr/local/nagios/bin/nagios /usr/local/nagios/etc/nagios.cfg". It starts up normally, runs for a few seconds, and then segfaults.
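A minimal Python sketch of how the same manual start could be wrapped so the exit cause is reported explicitly. The command path is the one quoted above; the `describe_exit` helper and the 60-second timeout are illustrative assumptions, not part of Nagios XI. On POSIX systems, `subprocess` reports a child killed by a signal as a negative return code, which is what the sketch decodes.

```python
import signal
import subprocess

# Command quoted in the post above; adjust if your install differs.
NAGIOS_CMD = ["/usr/local/nagios/bin/nagios", "/usr/local/nagios/etc/nagios.cfg"]


def describe_exit(cmd, timeout=60):
    """Run cmd in the foreground and say how it ended (hypothetical helper)."""
    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child before raising TimeoutExpired.
        return f"still running after {timeout} seconds"
    if proc.returncode < 0:
        # On POSIX a return code of -N means "terminated by signal N".
        return "killed by " + signal.Signals(-proc.returncode).name
    return f"exited with status {proc.returncode}"
```

Calling `describe_exit(NAGIOS_CMD)` on the affected machine would be expected to return `killed by SIGSEGV` for the behaviour described here, while a healthy start would run into the timeout instead.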

I ran the same command under strace, and after many pages of output I got this at the end:

write(20, "\n402:\n4=1483498445.664276\n174=vm"..., 519) = 519
write(20, "\n402:\n4=1483498445.664276\n174=vm"..., 522) = 522
write(20, "\n402:\n4=1483498445.664276\n174=vm"..., 553) = 553
write(20, "\n402:\n4=1483498445.664276\n174=vm"..., 500) = 500
write(20, "\n402:\n4=1483498445.664276\n174=vm"..., 488) = 488
write(20, "\n402:\n4=1483498445.664276\n174=vm"..., 533) = 533
write(20, "\n402:\n4=1483498445.664276\n174=vm"..., 535) = 535
write(20, "\n402:\n4=1483498445.664276\n174=vm"..., 528) = 528
write(20, "\n402:\n4=1483498445.664276\n174=vm"..., 520) = 520
write(20, "\n402:\n4=1483498445.664276\n174=vm"..., 520) = 520
write(20, "\n402:\n4=1483498445.664276\n174=vm"..., 508) = 508
write(20, "\n402:\n4=1483498445.664276\n174=ww"..., 481) = 481
write(20, "\n402:\n4=1483498445.664276\n174=ww"..., 483) = 483
write(20, "\n402:\n4=1483498445.664276\n174=ww"..., 508) = 508
write(20, "\n402:\n4=1483498445.664276\n174=ww"..., 511) = 511
write(20, "\n402:\n4=1483498445.664276\n174=ww"..., 477) = 477
write(20, "\n402:\n4=1483498445.664276\n174=ww"..., 482) = 482
write(20, "\n403:\n4=1483498445.664276\n220=AD"..., 109) = 109
write(20, "\n403:\n4=1483498445.664276\n220=AS"..., 281) = 281
write(20, "\n403:\n4=1483498445.664276\n220=Ba"..., 781) = 781
write(20, "\n403:\n4=1483498445.664276\n220=DR"..., 31828) = 31828
write(20, "\n403:\n4=1483498445.664276\n220=Me"..., 2738) = 2738
write(20, "\n403:\n4=1483498445.664276\n220=NC"..., 597) = 597
write(20, "\n403:\n4=1483498445.664276\n220=NF"..., 295) = 295
write(20, "\n403:\n4=1483498445.664276\n220=PB"..., 2023) = 2023
write(20, "\n403:\n4=1483498445.664276\n220=Po"..., 154) = 154
write(20, "\n403:\n4=1483498445.664276\n220=iC"..., 834) = 834
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV +++
Segmentation fault

Please help!
Last edited by dwhitfield on Mon Jan 09, 2017 10:04 am, edited 1 time in total.
Reason: marking with green check mark
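For long traces like the one above, a small script can pull out the last write() before the fatal signal instead of paging through the output. This is a minimal sketch, assuming the trace was saved as plain text in the format shown above. The function name and the two sample lines are illustrative, and the leading number in each payload (402, 403) is treated only as an opaque tag, since the thread does not say what it means.

```python
import re

# Matches lines such as: write(20, "\n403:\n4=..."..., 834) = 834
WRITE_RE = re.compile(r'^write\((\d+), "((?:[^"\\]|\\.)*)"(?:\.\.\.)?, (\d+)\)\s*=\s*(-?\d+)')
# strace prints a newline as the two characters backslash + n.
TAG_RE = re.compile(r'^\\n(\d+):')

SAMPLE = r'''
write(20, "\n403:\n4=1483498445.664276\n220=Po"..., 154) = 154
write(20, "\n403:\n4=1483498445.664276\n220=iC"..., 834) = 834
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV +++
'''


def last_write_before_signal(lines):
    """Return details of the last write() seen before a '--- SIG...' line.

    Returns None if no signal line is found, or if no write() precedes it.
    """
    last = None
    for line in lines:
        line = line.strip()
        if line.startswith("--- SIG"):
            return last
        match = WRITE_RE.match(line)
        if match:
            fd, payload, requested, returned = match.groups()
            tag = TAG_RE.match(payload)
            last = {
                "fd": int(fd),
                "tag": int(tag.group(1)) if tag else None,
                "requested": int(requested),
                "returned": int(returned),
            }
    return None


print(last_write_before_signal(SAMPLE.splitlines()))
```

In the trace above every write returns the full byte count it requested, so the last system call before the fault completed normally. That suggests the crash happened in the process's own code after the write rather than inside a failing call, though strace alone cannot show where.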

Post by rlacasse »

The strace I posted is cryptic, but what I'm seeing outside of the strace is warnings about services (duplicates, missing notification emails, and such), then host warnings, then the segfault. I'm not sure what the nagios process is trying to accomplish at this stage, so I don't know how to troubleshoot the issue further.

Any assistance is appreciated.

Thank you
dwhitfield
Former Nagios Staff
Posts: 4583
Joined: Wed Sep 21, 2016 10:29 am
Location: NoLo, Minneapolis, MN

Post by dwhitfield »

If you look through your objects.cache, or any of your .cfg files, do you see duplicates?

We can help look for duplicates, but to do that you'll need to PM me your Profile. You can download it by going to Admin > System Config > System Profile and clicking the Download Profile button towards the top. If for whatever reason you *cannot* download the profile, please put the output of View System Info (5.3.4+, Show Profile if older) in the thread (that will at least get us some info).

After you PM the profile, please update this thread. Updating this thread is the only way for it to show back up on our dashboard.

UPDATE: Profile shared with techs.
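A rough way to check a file for literal duplicates, sketched in Python. It assumes the `define <type> { ... }` block layout used by Nagios object files and only compares `host_name` (plus `service_description` for services). Templates, host groups and comma-separated host lists are not expanded, so it can miss duplicates that only appear after inheritance. The function name and the sample host are made up for illustration.

```python
import re
from collections import Counter

# One "define <type> { ... }" block; assumes no "}" inside attribute values.
BLOCK_RE = re.compile(r"define\s+(\w+)\s*\{(.*?)\}", re.S)

SAMPLE = """
define host {
    host_name    vmhost01
    address      192.0.2.10
    }
define service {
    host_name    vmhost01
    service_description    Ping
    }
define service {
    host_name    vmhost01
    service_description    Ping
    }
"""


def find_duplicates(text):
    """Return sorted keys of host/service definitions that appear more than once."""
    seen = Counter()
    for kind, body in BLOCK_RE.findall(text):
        attrs = {}
        for line in body.splitlines():
            parts = line.strip().split(None, 1)  # "name<whitespace>value"
            if len(parts) == 2:
                attrs[parts[0]] = parts[1].strip()
        if kind == "host":
            key = ("host", attrs.get("host_name", ""))
        elif kind == "service":
            key = (
                "service",
                attrs.get("host_name", ""),
                attrs.get("service_description", ""),
            )
        else:
            continue  # other object types are ignored in this sketch
        seen[key] += 1
    return sorted(key for key, count in seen.items() if count > 1)


print(find_duplicates(SAMPLE))
```

To run it against a real file, read the file's contents into a string and pass that to `find_duplicates`; intentional duplicates will still be listed, so the output needs a manual review.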

Post by rlacasse »

I've attached the results of the System Profile.

Going through the configuration and objects.cache, the only duplicates I found are expected and were present prior to the 5.4.0 upgrade.
Last edited by dwhitfield on Wed Jan 04, 2017 1:47 pm, edited 1 time in total.
Reason: removing profile for security purposes

Post by dwhitfield »

rlacasse wrote: what I'm seeing outside of the strace is the warnings about services, such as duplicates or missing notification emails and such, then host warnings, then segfault.
If you are seeing these in the GUI, could you post screenshots?

I'm noticing several files in your profile that have not been recently upgraded. Could you post your upgrade.log? Thanks!

Post by rlacasse »

I'm not seeing the warnings in the GUI, only when I look at the logs related to writing the configuration. As I've said, I know I have duplicates, and they're on purpose.

Attached is the upgrade log.

Thank you,

Post by dwhitfield »

Your upgrade.log makes it look like you were upgrading from 5.1.8, but I've seen that in a couple of logs recently, which makes me think that might not be accurate. From what version were you upgrading? Could you roll back and then try upgrading to 5.3.4 (https://assets.nagios.com/downloads/nag ... 3.4.tar.gz)? Then you could upgrade to 5.4.0. I'm just wondering if we missed a bug when upgrading from earlier versions.

Post by rlacasse »

I've performed the recommended steps: reverted to 5.2.7, manually upgraded to 5.3.4, then used the GUI to upgrade to 5.4.0.

The upgrade was successful, as expected.

I've manually restarted the nagios service post-upgrade and didn't have any issues. I'll keep an eye on it for a few days to see if the problem recurs. Last time it took a little over 24 hours before there was any issue.

Thank you for your assistance.

Post by dwhitfield »

We'll be closed in 24 hours for the weekend, but definitely on Monday we can resume things if you run into issues.

Glad things look like they are working so far!

Post by rlacasse »

No issues over the weekend; that appears to have resolved the issue.

Many thanks!
Locked