Nagios XI Services not working
Re: Nagios XI Services not working
I am having exactly the same problem. I updated to the latest version of XI this morning and rebooted the host. RHEL6 box.
from nagios.log
wproc: Registry request: name=Core Worker 32367;pid=32367
wproc: Registry request: name=Core Worker 32368;pid=32368
wproc: Registry request: name=Core Worker 32369;pid=32369
Event broker module '/usr/local/nagios/bin/ndomod.o' initialized successfully.
from /var/log/messages
Segmentation fault
Jan 24 09:32:02 monitor01 ndo2db: Trimming eventhandlers.
Jan 24 09:32:03 monitor01 kernel: nagios[32320]: segfault at 372a000 ip 000000391927f791 sp 00007ffe1c8880d8 error 6 in libc-2.12.so[3919200000+18a000]
I'd love to hear what you find out!
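For anyone reading along, the kernel line above carries enough to locate the faulting instruction: subtracting the mapping base inside the square brackets from the ip value gives the offset inside libc-2.12.so. A minimal sketch in POSIX shell, assuming the log line has the same shape as the one quoted here; the function name segfault_offset is made up for illustration.

```shell
# Print the offset of the faulting instruction inside the mapped library,
# given one kernel "segfault at ..." log line as the first argument.
# segfault_offset is a hypothetical helper name, not part of Nagios.
segfault_offset() {
    # Instruction pointer: the hex value following " ip ".
    ip=$(printf '%s\n' "$1" | sed -n 's/.* ip \([0-9a-fA-F]*\) .*/\1/p')
    # Mapping base: the hex value before "+" in the trailing [base+size].
    base=$(printf '%s\n' "$1" | sed -n 's/.*\[\([0-9a-fA-F]*\)+[0-9a-fA-F]*]$/\1/p')
    [ -n "$ip" ] && [ -n "$base" ] || return 1
    printf '0x%x\n' $(( 0x$ip - 0x$base ))
}

# Usage with the line from this post:
segfault_offset 'nagios[32320]: segfault at 372a000 ip 000000391927f791 sp 00007ffe1c8880d8 error 6 in libc-2.12.so[3919200000+18a000]'
# prints 0x7f791
```

On x86, error code 6 indicates a user-mode write to a page that was not mapped. If debug symbols for the exact libc build are available, the offset can be passed to a tool such as addr2line to see which function was executing, which may help narrow down the crash.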
-
askewdread
- Posts: 69
- Joined: Wed Nov 16, 2016 4:54 pm
Re: Nagios XI Services not working
uidaho wrote: I am having exactly the same problem. I updated to the latest version of XI this morning and rebooted the host. RHEL6 box.
from nagios.log
wproc: Registry request: name=Core Worker 32367;pid=32367
wproc: Registry request: name=Core Worker 32368;pid=32368
wproc: Registry request: name=Core Worker 32369;pid=32369
Event broker module '/usr/local/nagios/bin/ndomod.o' initialized successfully.
from /var/log/messages
Segmentation fault
Jan 24 09:32:02 monitor01 ndo2db: Trimming eventhandlers.
Jan 24 09:32:03 monitor01 kernel: nagios[32320]: segfault at 372a000 ip 000000391927f791 sp 00007ffe1c8880d8 error 6 in libc-2.12.so[3919200000+18a000]
I'd love to hear what you find out!
Looks pretty much the same as ours, except ours is CentOS 7.3.
Re: Nagios XI Services not working
Thanks for the retention.dat file. I noticed that it contains a service check called "Check WMI Physical Disk IO" whose output is being cut off at 8192 bytes, and that may be what is causing the issue.
For a test, can you disable that service check and see if the issue is resolved?
Thanks.
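One way to look for truncated or oversized entries in retention.dat before disabling anything is to list the lines at or above a given length. A rough sketch, assuming the default path /usr/local/nagios/var/retention.dat (the path is not stated in this thread) and a hypothetical helper name long_lines:

```shell
# List lines of a file whose length is at least a given number of bytes.
# $1: file to scan, $2: minimum length in bytes (defaults to 8192)
# Output columns (tab-separated): line number, length, first 40 bytes.
# long_lines is a hypothetical helper name, not a Nagios tool.
long_lines() {
    # LC_ALL=C makes awk count bytes rather than multibyte characters.
    LC_ALL=C awk -v min="${2:-8192}" \
        'length($0) >= min { printf "%d\t%d\t%.40s\n", NR, length($0), $0 }' "$1"
}

# Usage (path is an assumption; adjust to the local install):
# long_lines /usr/local/nagios/var/retention.dat 8192
```

A hit only shows where a long value lives; finding which service it belongs to still means reading the surrounding block in the file.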
Be sure to check out our Knowledgebase for helpful articles and solutions!
-
askewdread
- Posts: 69
- Joined: Wed Nov 16, 2016 4:54 pm
Re: Nagios XI Services not working
Hey,
Thanks for that. I had to remove the retention.dat file again that time, but I suspect that's expected as it still had those checks inside it. I'll keep an eye on it and let you know.
-
dwhitfield
- Former Nagios Staff
- Posts: 4583
- Joined: Wed Sep 21, 2016 10:29 am
- Location: NoLo, Minneapolis, MN
- Contact:
Re: Nagios XI Services not working
Thanks for letting us know. We await results.
-
askewdread
- Posts: 69
- Joined: Wed Nov 16, 2016 4:54 pm
Re: Nagios XI Services not working
Hi,
Unfortunately this recurred this morning. Latest retention.dat attached.
Re: Nagios XI Services not working
Can you run a MySQL command for me? As root, from the command line:
echo "use nagios;select count(*) from nagios_servicestatus where LENGTH(output) >= 255;" | mysql -u root -pnagiosxi
You may need to change the last part if the password is not nagiosxi. I have a theory that the segfault might be related to long output being parsed, but we are not able to replicate this internally to test.
Former Nagios employee
-
askewdread
- Posts: 69
- Joined: Wed Nov 16, 2016 4:54 pm
Re: Nagios XI Services not working
tmcdonald wrote: Can you run a MySQL command for me? As root, from the command line:
echo "use nagios;select count(*) from nagios_servicestatus where LENGTH(output) >= 255;" | mysql -u root -pnagiosxi
You may need to change the last part if the password is not nagiosxi. I have a theory that the segfault might be related to long output being parsed, but we are not able to replicate this internally to test.
It comes back:
Code:
count(*)
0
Re: Nagios XI Services not working
Can you run this command and post the output so we can see if the MySQL table settings are correct?
Code:
echo 'desc nagios_servicestatus;' | mysql -t -pnagiosxi nagios
Can you also either post or PM me the full /var/log/messages and /usr/local/nagios/var/nagios.log files so we can view them?
-
askewdread
- Posts: 69
- Joined: Wed Nov 16, 2016 4:54 pm
Re: Nagios XI Services not working
tgriep wrote: Can you run this command and post the output so we can see if the MySQL table settings are correct?
Code:
echo 'desc nagios_servicestatus;' | mysql -t -pnagiosxi nagios
Can you either post or PM me the full /var/log/messages and the /usr/local/nagios/var/nagios.log files so we can view them?
mysql:
Code:
+-------------------------------+--------------+------+-----+---------------------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------------------------+--------------+------+-----+---------------------+----------------+
| servicestatus_id | int(11) | NO | PRI | NULL | auto_increment |
| instance_id | smallint(6) | NO | MUL | 0 | |
| service_object_id | int(11) | NO | UNI | 0 | |
| status_update_time | datetime | NO | MUL | 0000-00-00 00:00:00 | |
| output | varchar(255) | NO | | | |
| long_output | text | NO | | NULL | |
| perfdata | text | NO | | NULL | |
| current_state | smallint(6) | NO | MUL | 0 | |
| has_been_checked | smallint(6) | NO | | 0 | |
| should_be_scheduled | smallint(6) | NO | | 0 | |
| current_check_attempt | smallint(6) | NO | | 0 | |
| max_check_attempts | smallint(6) | NO | | 0 | |
| last_check | datetime | NO | | 0000-00-00 00:00:00 | |
| next_check | datetime | NO | | 0000-00-00 00:00:00 | |
| check_type | smallint(6) | NO | MUL | 0 | |
| last_state_change | datetime | NO | MUL | 0000-00-00 00:00:00 | |
| last_hard_state_change | datetime | NO | | 0000-00-00 00:00:00 | |
| last_hard_state | smallint(6) | NO | | 0 | |
| last_time_ok | datetime | NO | | 0000-00-00 00:00:00 | |
| last_time_warning | datetime | NO | | 0000-00-00 00:00:00 | |
| last_time_unknown | datetime | NO | | 0000-00-00 00:00:00 | |
| last_time_critical | datetime | NO | | 0000-00-00 00:00:00 | |
| state_type | smallint(6) | NO | MUL | 0 | |
| last_notification | datetime | NO | | 0000-00-00 00:00:00 | |
| next_notification | datetime | NO | | 0000-00-00 00:00:00 | |
| no_more_notifications | smallint(6) | NO | | 0 | |
| notifications_enabled | smallint(6) | NO | MUL | 0 | |
| problem_has_been_acknowledged | smallint(6) | NO | MUL | 0 | |
| acknowledgement_type | smallint(6) | NO | | 0 | |
| current_notification_number | smallint(6) | NO | | 0 | |
| passive_checks_enabled | smallint(6) | NO | MUL | 0 | |
| active_checks_enabled | smallint(6) | NO | MUL | 0 | |
| event_handler_enabled | smallint(6) | NO | MUL | 0 | |
| flap_detection_enabled | smallint(6) | NO | MUL | 0 | |
| is_flapping | smallint(6) | NO | MUL | 0 | |
| percent_state_change | double | NO | MUL | 0 | |
| latency | double | NO | MUL | 0 | |
| execution_time | double | NO | MUL | 0 | |
| scheduled_downtime_depth | smallint(6) | NO | MUL | 0 | |
| failure_prediction_enabled | smallint(6) | NO | | 0 | |
| process_performance_data | smallint(6) | NO | | 0 | |
| obsess_over_service | smallint(6) | NO | | 0 | |
| modified_service_attributes | int(11) | NO | | 0 | |
| event_handler | varchar(255) | NO | | | |
| check_command | varchar(255) | NO | | | |
| normal_check_interval | double | NO | | 0 | |
| retry_check_interval | double | NO | | 0 | |
| check_timeperiod_object_id | int(11) | NO | | 0 | |
+-------------------------------+--------------+------+-----+---------------------+----------------+
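The table description above shows output as varchar(255), while long_output and perfdata are text columns that can hold far longer values. A possible follow-up to the earlier count query is to list the rows with the longest values in one of those columns. This is only a sketch: longest_values_sql is a hypothetical helper that prints the SQL, the column names are taken from the description above, and the credentials in the usage line are the ones used elsewhere in this thread.

```shell
# Print a query listing the rows with the longest values in one
# nagios_servicestatus column.
# $1: column name (output, long_output or perfdata)
# $2: number of rows to show (defaults to 10)
# longest_values_sql is a hypothetical helper name.
longest_values_sql() {
    # Only accept the three free-text columns from the table description.
    case "$1" in
        output|long_output|perfdata) ;;
        *) echo "unsupported column: $1" >&2; return 1 ;;
    esac
    printf 'SELECT service_object_id, LENGTH(%s) AS bytes FROM nagios_servicestatus ORDER BY bytes DESC LIMIT %d;\n' "$1" "${2:-10}"
}

# Usage (change the password if it is not nagiosxi):
# longest_values_sql long_output 5 | mysql -t -u root -pnagiosxi nagios
```

LENGTH() in MySQL counts bytes, so the bytes column can be compared directly with limits such as the 8192-byte cutoff mentioned earlier in the thread.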