Hello,
We successfully updated our Nagios XI server from 5.4.13 to 5.5.4 yesterday. The only update issue we had was that check_mssql_database.py stopped working for our MSSQL database checks. We restored the original script from a backup, after which it worked again. We are still investigating what changed to make it stop working.
Thanks for the long awaited NagVis integration.
What I noticed in the new NagVis:
- I seem unable to change some settings on migrated maps. After recreating the map everything seems to work as expected.
- Some settings in the right-click context menu don't work
> Schedule downtime => The requested URL /nagiosxi/includes/components/xicore/cmd.cgi was not found on this server.
> Re-Schedule next Check => The requested URL /nagiosxi/includes/components/xicore/cmd.cgi was not found on this server.
> Acknowledge => The requested feature is not available for this backend. The MKLivestatus backend supports this feature.
Please let me know what is supposed to work and what is not.
Grtz
Willem
Nagios XI 5.5.4 - Load Issues (ipcs queue)
Last edited by WillemDH on Fri Oct 12, 2018 9:28 am, edited 1 time in total.
Nagios XI 5.8.1
https://outsideit.net
Re: Nagios XI 5.5.4
WillemDH wrote:
- I seem unable to change some settings on migrated maps. After recreating the map everything seems to work as expected.
What settings are you talking about? Can you describe in detail how you migrated the NagVis maps, and which settings you tried to change? We will try to recreate the issue in-house.
WillemDH wrote:
- Some settings in the right-click context menu don't work
> Schedule downtime => The requested URL /nagiosxi/includes/components/xicore/cmd.cgi was not found on this server.
> Re-Schedule next Check => The requested URL /nagiosxi/includes/components/xicore/cmd.cgi was not found on this server.
> Acknowledge => The requested feature is not available for this backend. The MKLivestatus backend supports this feature.
I was able to recreate all three issues and filed an internal bug report (task_id=13666) for updating/fixing the NagVis component's URLs.
Be sure to check out our Knowledgebase for helpful articles and solutions!
Re: Nagios XI 5.5.4
Ludmill,
Unfortunately I seem to be getting load issues after all. The server worked fine for 2 days, during which we did several apply configurations. One hour ago, however, I added 1 service to a host, and after applying, things didn't come back up as usual. The ipcs queue stayed above 240k and didn't seem to drop.
After 1 hour, the ipcs queue still hadn't gone any lower, so I decided to shut down the server and add 6 extra CPUs. After rebooting, the ipcs queue stayed very high. At this moment it's around 150k or so...
I already implemented all kinds of performance optimizations (reaper / PHP / ramdisk). Everything seemed to work well after the update, so what could cause this sudden change in behaviour?
CPU load is also very low (+-4) for 16 CPUs. The only thing that's off is the message queue. I kept an eye on that over the last few days, and it always went back to 0 about 2 minutes after an apply.
I see 5.5.5 has been released. Is there any chance one of the fixed issues could cause the behaviour I'm describing above? My issue seems to be similar to https://support.nagios.com/forum/viewto ... cs#p263354
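For anyone who wants to track the queue depth described above over time, here is a minimal Python sketch that sums the messages column of `ipcs -q` output. It assumes the usual util-linux column layout (key, msqid, owner, perms, used-bytes, messages). The names `total_queued_messages` and `read_ipcs` and the sample text are made up for illustration and do not come from this server.

```python
import subprocess


def total_queued_messages(ipcs_output: str) -> int:
    """Sum the last column (messages) over all queue rows of `ipcs -q` output."""
    total = 0
    for line in ipcs_output.splitlines():
        fields = line.split()
        # Assumption: queue rows have six columns and start with a hex key
        # such as 0x1f010002; header and separator lines are skipped.
        if len(fields) == 6 and fields[0].startswith("0x") and fields[-1].isdigit():
            total += int(fields[-1])
    return total


def read_ipcs() -> str:
    """Run `ipcs -q` and return its text output (Linux only)."""
    result = subprocess.run(
        ["ipcs", "-q"], capture_output=True, text=True, check=True
    )
    return result.stdout


# Illustrative sample, not real data from this thread.
SAMPLE = """
------ Message Queues --------
key        msqid      owner      perms      used-bytes   messages
0x1f010002 32768      nagios     600        1024000      1000
0x1f010003 32769      nagios     600        0            0
"""

if __name__ == "__main__":
    print(total_queued_messages(SAMPLE))
```

On a live system, calling `total_queued_messages(read_ipcs())` once a minute after an apply would give a series of readings that can be compared between versions.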
EDIT1: Tried another apply and same behaviour....
I will patch Monday to 5.5.5 hoping that fixes my issues.
EDIT2: Tried disabling BPI setting "Sync all hostgroups and servicegroups on apply config." => Same issue
EDIT3: Checking /var/log/messages I seem to find quite a few of these:
Code: Select all
Oct 12 16:56:54 srvnagios ndo2db: Error: mysql_query() failed for 'INSERT INTO nagios_downtimehistory SET instance_id='1', downtime_type='2', object_id='60899', entry_time=FROM_UNIXTIME(1532597869), author_name='Claeys Stephen', comment_data='Citrix XenApp Template server voor Golden Image', internal_downtime_id='715114', triggered_by_id='0', is_fixed='1', duration='30988828800', scheduled_start_time=FROM_UNIXTIME(1514761200), scheduled_end_time=FROM_UNIXTIME(32503590000) ON DUPLICATE KEY UPDATE instance_id='1', downtime_type='2', object_id='60899', entry_time=FROM_UNIXTIME(1532597869), author_name='Cleys Stepen', comment_data='Citrix XenApp Template server voor Golden Image', internal_downtime_id='715114', triggered_by_id='0', is_fixed='1', duration='30988828800', scheduled_start_time=FROM_UNIXTIME(1514761200), scheduled_end_time=FROM_UNIXTIME(32503590000)'
Oct 12 16:56:54 srvnagios ndo2db: Error: mysql_query() failed for 'INSERT INTO nagios_scheduleddowntime SET instance_id='1', downtime_type='2', object_id='60899', entry_time=FROM_UNIXTIME(1532597869), author_name='Cleys Stepen', comment_data='Citrix XenApp Template server voor Golden Image', internal_downtime_id='715114', triggered_by_id='0', is_fixed='1', duration='30988828800', scheduled_start_time=FROM_UNIXTIME(1514761200), scheduled_end_time=FROM_UNIXTIME(32503590000) ON DUPLICATE KEY UPDATE instance_id='1', downtime_type='2', object_id='60899', entry_time=FROM_UNIXTIME(1532597869), author_name='Claeys Stephen', comment_data='Citrix XenApp Template server voor Golden Image', internal_downtime_id='715114', triggered_by_id='0', is_fixed='1', duration='30988828800', scheduled_start_time=FROM_UNIXTIME(1514761200), scheduled_end_time=FROM_UNIXTIME(32503590000)'
Willem
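The Unix timestamps in the log lines above are easier to judge once decoded. The short Python sketch below converts the scheduled start and end times to UTC and checks them against the signed 32-bit limit (19 January 2038). Whether the far-future end time is what makes the INSERT fail is not established anywhere in this thread; it is only one possibility worth ruling out. The helper names `to_utc` and `exceeds_int32` are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
INT32_MAX = 2**31 - 1  # corresponds to 2038-01-19 03:14:07 UTC

# Values copied from the ndo2db log lines.
SCHEDULED_START = 1514761200
SCHEDULED_END = 32503590000
DURATION = 30988828800


def to_utc(ts: int) -> datetime:
    """Convert Unix seconds to UTC without relying on platform time_t limits."""
    return EPOCH + timedelta(seconds=ts)


def exceeds_int32(ts: int) -> bool:
    """True if the timestamp does not fit in a signed 32-bit integer."""
    return ts > INT32_MAX


if __name__ == "__main__":
    for name, ts in [("start", SCHEDULED_START), ("end", SCHEDULED_END)]:
        print(name, to_utc(ts).isoformat(), "beyond 32-bit range:", exceeds_int32(ts))
    print("duration in days:", DURATION // 86400)
```

The end time decodes to the year 2999, and the logged duration equals end minus start, so this looks like a deliberately "permanent" downtime rather than corrupted data.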
Re: Nagios XI 5.5.4 - Load Issues (ipcs queue)
Upgrading to Nagios XI 5.5.5 should resolve the issues with load and the BPI component.
https://www.nagios.com/downloads/nagios-xi/change-log/
- Fixed user permissions on non-active objects causing large/slow SQL queries on some systems -JO
- Fixed status check for NDO in BPI component API tool so that it properly sleeps after each call -JO
Note: If you are still having issues after the upgrade, open a ticket via our support center - https://support.nagios.com/tickets/, and send us your latest profile.
Re: Nagios XI 5.5.4 - Load Issues (ipcs queue)
Ludmill,
It seems the number of processes went up sometime Friday morning. I'm not sure it is related, but I wanted to mention it anyway, as I can't tell which processes explain this sudden rise.
We updated to 5.5.4 last Wednesday around 11:00. The performance issues started on Friday.
Since Friday morning, the number of processes has gone up from +- 350 to +- 475. What could cause this sudden spike in processes? I'm not immediately seeing in top which processes these could be.
EDIT 1: Updated to 5.5.5 around 09:30. Did an apply config afterwards. Things don't seem to go much better. Checking the message queue:
Code: Select all
09:58:15 => Apply => 0 > 175000
09:59:15 => 140000
10:00:05 => 105000
10:01:15 => 75000
10:02:15 => 50000
10:03:15 => 25000
10:03:45 => 0
So it seems to take up to 6 minutes before the message queue calms down now... I will need to get this down to an acceptable level somehow. 2 minutes was just doable.
What is the main reason this queue gets so high? Is it the number of Nagios objects (hosts / services) or the number of checks/s?
PM'ed you a system profile.
Willem
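To put a number on how fast the queue drains, the samples from the post above can be turned into a rate. This is a minimal Python sketch using only the posted values; the function names `seconds_between` and `drain_rates` are made up for illustration, and it assumes all samples fall on the same day.

```python
from datetime import datetime

# (time, queue depth) samples copied from the post.
SAMPLES = [
    ("09:58:15", 175000),
    ("09:59:15", 140000),
    ("10:00:05", 105000),
    ("10:01:15", 75000),
    ("10:02:15", 50000),
    ("10:03:15", 25000),
    ("10:03:45", 0),
]


def seconds_between(start: str, end: str) -> float:
    """Elapsed seconds between two HH:MM:SS times on the same day."""
    fmt = "%H:%M:%S"
    return (datetime.strptime(end, fmt) - datetime.strptime(start, fmt)).total_seconds()


def drain_rates(samples):
    """Return (interval_seconds, messages_per_second) for each consecutive pair."""
    rates = []
    for (t0, q0), (t1, q1) in zip(samples, samples[1:]):
        dt = seconds_between(t0, t1)
        rates.append((dt, (q0 - q1) / dt))
    return rates


if __name__ == "__main__":
    total = seconds_between(SAMPLES[0][0], SAMPLES[-1][0])
    print(f"total drain time: {total:.0f} s")
    print(f"average rate: {SAMPLES[0][1] / total:.0f} msg/s")
    for dt, rate in drain_rates(SAMPLES):
        print(f"{dt:>4.0f} s  {rate:7.1f} msg/s")
```

The listed samples span 330 seconds from peak to empty. Comparing the per-interval rate across applies would show whether the peak got higher or the drain got slower.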
Re: Nagios XI 5.5.4 - Load Issues (ipcs queue)
You opened a new support ticket in our system, so we will continue communicating via email. I am locking this topic.