Re: CRITICAL! NAGIOS XI SEGMENTATION FAULT

Posted: Wed Jun 15, 2011 11:24 am
by arnab.roy
OK, since I got very little help from your end, I went ahead and did the following:

1. I upgraded Nagios, hoping it would reset any permission issues. Nothing happened.
2. The fact that I couldn't see services for a particular host raised my suspicion, so I went ahead and deleted it. It got deleted from CCM but not from Nagios XI.
3. Restarted all services. No luck whatsoever.
4. Rebooted the whole system, and now that ghost host is gone and the system is back to normal.

Although I have solved the problem, I am quite sure the system got caught up in some kind of bug.

Re: CRITICAL! NAGIOS XI SEGMENTATION FAULT

Posted: Wed Jun 15, 2011 11:37 am
by arnab.roy
OK... I now have a pattern: adding that host back caused the system to crash again!!!

Re: CRITICAL! NAGIOS XI SEGMENTATION FAULT

Posted: Wed Jun 15, 2011 11:46 am
by arnab.roy
It's something in that host group: adding any hosts to that group causes it to crash.

Re: CRITICAL! NAGIOS XI SEGMENTATION FAULT

Posted: Wed Jun 15, 2011 3:25 pm
by nscott
Is there a way for you to isolate the host that is causing the issue?

Also, the segmentation faults start at 14:00 which coincide. Can you manually connect to the psql database?

psql -U nagiosxi -W -d nagiosxi
password: n@gweb

It appears it's having difficulty connecting. I'm not sure what the cause of that is, so let's start there.

Re: CRITICAL! NAGIOS XI SEGMENTATION FAULT

Posted: Wed Jun 15, 2011 5:25 pm
by arnab.roy
Hi All,

I have found a major bug in the system. The root cause of this issue was a debug option that I had enabled on one of my own plugins, which was dumping out a table like this:

WLSX-SWITCH-MIB::apSignalToNoiseRatio.'....[.' = INTEGER: 36
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'....[.' = INTEGER: 36
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'....[.' = INTEGER: 36
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'....[.' = INTEGER: 36
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'....\.' = INTEGER: 36
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'....\.' = INTEGER: 36
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'....\.' = INTEGER: 36
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'....\.' = INTEGER: 36
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'......' = INTEGER: 36
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'......' = INTEGER: 36
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'......' = INTEGER: 36
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'......' = INTEGER: 36
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'..... ' = INTEGER: 30
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'.....!' = INTEGER: 30
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'.....(' = INTEGER: 60
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'.....)' = INTEGER: 60
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'....#.' = INTEGER: 42
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'....#.' = INTEGER: 42
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'....#.' = INTEGER: 60
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'....#.' = INTEGER: 60
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'....$.' = INTEGER: 36
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'....$.' = INTEGER: 36
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'....$.' = INTEGER: 60
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'....$.' = INTEGER: 60
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'....&.' = INTEGER: 30
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'....&.' = INTEGER: 30
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'....&.' = INTEGER: 60
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'....&.' = INTEGER: 60
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'....)@' = INTEGER: 36
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'....)A' = INTEGER: 36
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'....)H' = INTEGER: 60
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'....)I' = INTEGER: 60
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'....,.' = INTEGER: 42
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'....,.' = INTEGER: 42
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'....,.' = INTEGER: 60
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'....,.' = INTEGER: 60
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'......' = INTEGER: 30
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'......' = INTEGER: 30
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'......' = INTEGER: 42
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'......' = INTEGER: 42
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'..../@' = INTEGER: 24
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'..../A' = INTEGER: 24
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'..../H' = INTEGER: 60
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'..../I' = INTEGER: 60
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'..../.' = INTEGER: 36
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'..../.' = INTEGER: 36
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'..../.' = INTEGER: 48
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'..../.' = INTEGER: 48
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'....0`' = INTEGER: 30
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'....0a' = INTEGER: 30
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'....0h' = INTEGER: 42
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'....0i' = INTEGER: 42
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'....0.' = INTEGER: 42
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'....0.' = INTEGER: 42
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'....0.' = INTEGER: 42
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'....0.' = INTEGER: 42
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'....1.' = INTEGER: 42
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'....1.' = INTEGER: 42
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'....1.' = INTEGER: 60
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'....1.' = INTEGER: 60
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'....D.' = INTEGER: 42
WLSX-SWITCH-MIB::apSignalToNoiseRatio.'....D.' = INTEGER: 42

This caused the system to crash. Once I turned the debug off and it stopped dumping this output into Nagios, it started to work normally. Food for thought for your developers :)
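
One way to keep debug detail away from Nagios is to write it to a log file and print only a single status line on stdout. The following is a minimal sketch, not the plugin from this thread: the function names, the log path, and the thresholds are all hypothetical.

```shell
#!/bin/sh
# Hypothetical plugin skeleton: debug detail goes to a log file, and only
# one status line reaches stdout.

# Assumed log location; override by setting DEBUG_LOG before running.
DEBUG_LOG="${DEBUG_LOG:-/tmp/check_ap_snr.debug.log}"

debug() {
    # Append debug detail to the log only when DEBUG=1; never print it.
    [ "${DEBUG:-0}" = "1" ] && printf '%s\n' "$*" >> "$DEBUG_LOG"
    return 0
}

check_snr() {
    # $1 = SNR value, $2 = warning threshold (both illustrative).
    snr=$1
    warn=$2
    debug "raw SNR value: $snr"
    if [ "$snr" -lt "$warn" ]; then
        echo "WARNING - SNR is $snr | snr=$snr"
        return 1    # 1 is the standard plugin return code for WARNING
    fi
    echo "OK - SNR is $snr | snr=$snr"
    return 0        # 0 is the standard plugin return code for OK
}
```

With DEBUG=1 set, the raw values end up in the log file while stdout stays at one line per check.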

Re: CRITICAL! NAGIOS XI SEGMENTATION FAULT

Posted: Wed Jun 15, 2011 5:52 pm
by tonyyarusso
There are two issues here. The first is that Nagios didn't intelligently handle the output of your plugin to limit it nicely. The second is that Red Hat / CentOS have a broken Apache that segfaults when there is a problem instead of nicely throwing an error. The latter is outside of our control. The former is the reason that, by default, Nagios has a limit on plugin output length, so that a broken plugin such as yours cannot damage the system. However, you chose to bypass that protection (see your previous thread at http://support.nagios.com/forum/viewtopic.php?t=2450), which is why you encountered this problem when your plugin went haywire.
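
As an illustration of the idea of capping plugin output, here is a rough sketch of a wrapper that truncates whatever a plugin prints while passing its exit code through. This is not how Nagios implements its own limit; `limit_output`, `run_limited`, and the `MAX_BYTES` value are made-up names and numbers for the example.

```shell
#!/bin/sh
# Illustrative cap only; this is not a Nagios default.
MAX_BYTES="${MAX_BYTES:-4096}"

limit_output() {
    # Keep at most MAX_BYTES bytes of stdin.
    # head -c is not in POSIX but is available in GNU and BSD head.
    head -c "$MAX_BYTES"
}

run_limited() {
    # Run the plugin command given as arguments, cap its stdout, and
    # return the plugin's own exit code rather than head's.
    rc=0
    out=$("$@") || rc=$?
    printf '%s\n' "$out" | limit_output
    return "$rc"
}
```

A runaway plugin wrapped this way can still report its state, but cannot push more than the capped amount of text downstream.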

Re: CRITICAL! NAGIOS XI SEGMENTATION FAULT

Posted: Thu Jun 16, 2011 10:41 am
by arnab.roy
Hi Tony,

I didn't make any changes to the default output lengths. I would like to highlight that only the XI interface broke down, not Nagios / Nagios Core, as everything was working fine when accessed via nagioscore.

Many Thanks
Arnab

Re: CRITICAL! NAGIOS XI SEGMENTATION FAULT

Posted: Fri Jun 17, 2011 10:46 am
by mguthrie
Just for our own future reference, can you show us the output from running that check from the command line with the debugging turned on?

Also, can you show us the exit code for that plugin after running it like that?

echo $?

It's always good to know "why" things break : )
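
When collecting that information, it may help to record the size of the output along with the exit code. A small sketch follows, assuming the check can be run as an ordinary command; `report_check` is a hypothetical helper name, not part of Nagios.

```shell
#!/bin/sh
# report_check: run a plugin command and summarise its result.
report_check() {
    rc=0
    # stderr is merged in so nothing is lost when inspecting by hand.
    out=$("$@" 2>&1) || rc=$?
    bytes=$(printf '%s' "$out" | wc -c | tr -d ' ')
    lines=$(printf '%s\n' "$out" | wc -l | tr -d ' ')
    echo "exit code: $rc"
    echo "output bytes: $bytes"
    echo "output lines: $lines"
}
```

It would be called with the plugin path and its usual arguments, for example `report_check ./your_plugin -H somehost`, where both names are placeholders.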