SIGSEGV: Nagios going down

maytheforcebeprosper
Posts: 6
Joined: Tue Aug 09, 2016 2:54 pm

Post by maytheforcebeprosper »

Background:

I built a new machine:
24 GB RAM
4 cores, 2 GHz
Nagios 4.2
RHEL 6.8

Nagios was compiled from source.

After starting Nagios with "service nagios start", I get this in the logs:

Code:

[1471296780] HOST ALERT:server.com;DOWN;SOFT;1;FPING CRITICALserver.com (loss=100% )
[1471296780] HOST ALERT:server.com;DOWN;SOFT;1;FPING CRITICAL - server.com (loss=100% )
[1471296780] HOST ALERT:server.com;DOWN;SOFT;1;FPING CRITICAL - host is unreachable
[1471296780] HOST ALERT:server.com;DOWN;SOFT;1;FPING CRITICAL -server.com (loss=100% )
[1471296780] HOST ALERT: server.com;DOWN;SOFT;1;FPING CRITICAL - server.com (loss=100% )
[1471296780] HOST ALERT: server.comDOWN;SOFT;1;FPING CRITICAL - server.com(loss=100% )
[1471296780] Caught SIGSEGV, shutting down...

This happens about 30 seconds after startup. Any ideas on how I can troubleshoot this?
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Post by tmcdonald »

I'd start with something like this:

Code:

cd /usr/local/nagios/bin/
strace ./nagios ../etc/nagios.cfg

Adjust the paths to match your installation; you might need to install strace first. This should give us an idea of what is happening right before it crashes.
Former Nagios employee
maytheforcebeprosper
Posts: 6
Joined: Tue Aug 09, 2016 2:54 pm

Post by maytheforcebeprosper »

Code:

write(15, "-vlan1161-1.net.domain.com\n\tin"..., 4096) = 4096
write(15, "\n\ndefine hostdependency {\n\thost_"..., 4096) = 4096
write(15, "law-gw-1-vlan1683-1.net.domain"..., 4096) = 4096
write(15, "s\td,u\n\t}\n\ndefine hostdependency "..., 4096) = 4096
write(15, "aw-gw-1-vlan2866-1.net.domain."..., 4096) = 4096
write(15, "\td,u\n\t}\n\ndefine hostdependency {"..., 4096) = 4096
write(15, "t_name\tlaw-gw-1-vlan3254-1.net.c"..., 4096) = 4096
write(15, "\td,u\n\t}\n\ndefine hostdependency {"..., 4096) = 4096
write(15, "7-1.net.domain.com\n\tinherits_p"..., 4096) = 4096
write(15, "du\n\tinherits_parent\t0\n\tnotificat"..., 4096) = 4096
write(15, "\thost_name\tphi-gw-1.net.domain"..., 4096) = 4096
write(15, "a.edu\n\tinherits_parent\t0\n\texecut"..., 4096) = 4096
write(15, "rits_parent\t0\n\tnotification_fail"..., 4096) = 4096
write(15, "hostdependency {\n\thost_name\tsyra"..., 4096) = 4096
write(15, "\tdependent_host_name\tsyracuse-gw"..., 4096) = 4096
write(15, "umbia.edu\n\tdependent_host_name\tw"..., 4096) = 4096
write(15, "ent\t0\n\texecution_failure_options"..., 4096) = 4096
write(15, "t.domain.com\n\tdependent_host_n"..., 4096) = 4096
write(15, "arent\t0\n\tnotification_failure_op"..., 4096) = 4096
write(15, "-gw-1.net.domain.com\n\tdependen"..., 4096) = 4096
write(15, "its_parent\t0\n\texecution_failure_"..., 4096) = 4096
write(15, "e\twat-gw-1.net.domain.com\n\tdep"..., 4096) = 4096
write(15, "u\n\tinherits_parent\t0\n\tnotificati"..., 4096) = 4096
write(15, "-1.net.domain.com\n\tdependent_h"..., 4096) = 4096
write(15, "on_failure_options\td,u\n\t}\n\ndefin"..., 3164) = 3164
close(15)                               = 0
munmap(0x7fa64a5bd000, 4096)            = 0
unlink("/var/nagios/status.dat")        = -1 ENOENT (No such file or directory)
open("/var/nagios/retention.dat", O_RDONLY) = 15
fstat(15, {st_mode=S_IFREG|0600, st_size=55192, ...}) = 0
mmap(NULL, 55192, PROT_READ, MAP_PRIVATE, 15, 0) = 0x7fa64a5b0000
munmap(0x7fa64a5b0000, 55192)           = 0
close(15)                               = 0

That's where it dies.

I'll look into /var/nagios/status.dat.

Is there something I might be missing? More output needed?
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Post by tmcdonald »

Alright, so it doesn't appear to be the main process. Must be a child then.

strace -f -s 256 ./nagios ../etc/nagios.cfg
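
For reference, -f makes strace follow forked child processes, and -s 256 raises the maximum string length it prints (the default is 32). With children in the mix the output gets long, so it may help to write it to a file with -o and then look for the process that took the signal. Below is a rough sketch of that; the helper name segv_pids and the file /tmp/nagios.strace are made up for illustration, and it assumes the usual "strace -f -o" layout where each line starts with the PID.

```shell
# Hypothetical helper: list the PIDs that received SIGSEGV in a trace file
# written by "strace -f -o FILE ...". In that mode strace is expected to
# start every line with the PID of the process that produced it.
segv_pids() {
    awk '/--- SIGSEGV/ { print $1 }' "$1" | sort -u
}

# Capture step (adjust paths; /tmp/nagios.strace is an assumed file name):
#   strace -f -s 256 -o /tmp/nagios.strace ./nagios ../etc/nagios.cfg
# Then find the crashing PID and read what it did just before the signal:
#   segv_pids /tmp/nagios.strace
#   grep "^<pid> " /tmp/nagios.strace | tail -n 30
```

The last few syscalls of the PID it reports are usually the interesting part.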

Otherwise you might need to enable a core dump for analysis:

http://antmeetspenguin.blogspot.com/201 ... -dump.html
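
A minimal sketch of the core-dump route, independent of the linked article and assuming a stock bash or sh and the paths used earlier in this thread. The helper name core_limit is made up; ulimit -c is the standard shell builtin for the core-file size limit. Where the core file ends up depends on the kernel's core_pattern setting (see /proc/sys/kernel/core_pattern), so that part will vary by system.

```shell
# Hypothetical helper: try to lift the core-file size limit for this shell,
# then print the limit that is actually in effect.
core_limit() {
    ulimit -c unlimited 2>/dev/null || echo "could not raise core limit" >&2
    ulimit -c
}

core_limit

# Next, in the same shell, run Nagios in the foreground and wait for the crash
# (paths are assumptions taken from earlier in the thread):
#   /usr/local/nagios/bin/nagios /usr/local/nagios/etc/nagios.cfg
# If a core file is produced, load it together with the binary and ask for a backtrace:
#   gdb /usr/local/nagios/bin/nagios /path/to/core
#   (gdb) bt
```

If core_limit prints 0, the limit could not be raised in that shell and no core file will be written.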

This sort of thing is not easy to track down over a forum, so apologies in advance.