Redhat 7
Nagios XI 5.4.10
Nagios is freezing on us in our production environment. It has been impacting other items in the environment like the checkresult queue which gets backed up as well. This appears to happen at random times during the day. What can I look at to see why the messages aren't clearing out?
------ Message Queues --------
key        msqid   owner  perms used-bytes messages
0x8b000040 2031617 nagios 600   138176512  134938   <--- This number stops going down unless I restart ndo2db about 3 to 4 times
Also ndo2db appears to be freezing up once a week or less on the system I'm working with.
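One way to tell whether the queue is actually stuck or just draining slowly is to sample the `messages` column over time. The following is a minimal sketch, not anything from Nagios itself: the names `QueueStat`, `parse_ipcs_queues`, `read_live_queues`, and `is_stalled` are made up for illustration, and it assumes `ipcs -q` prints the six columns shown above.

```python
import subprocess
from typing import List, NamedTuple


class QueueStat(NamedTuple):
    key: str
    msqid: int
    owner: str
    perms: str
    used_bytes: int
    messages: int


def parse_ipcs_queues(text: str) -> List[QueueStat]:
    """Parse the table printed by `ipcs -q` into QueueStat rows."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows start with a hex key such as 0x8b000040;
        # skip the banner, the column header, and blank lines.
        if len(fields) < 6 or not fields[0].startswith("0x"):
            continue
        key, msqid, owner, perms, used, msgs = fields[:6]
        rows.append(QueueStat(key, int(msqid), owner, perms, int(used), int(msgs)))
    return rows


def read_live_queues() -> List[QueueStat]:
    """Run `ipcs -q` on the local host and parse its output."""
    result = subprocess.run(
        ["ipcs", "-q"], capture_output=True, text=True, check=True
    )
    return parse_ipcs_queues(result.stdout)


def is_stalled(message_counts: List[int]) -> bool:
    """True if a non-empty queue never shrank across consecutive samples."""
    if len(message_counts) < 2 or message_counts[-1] == 0:
        return False
    return all(b >= a for a, b in zip(message_counts, message_counts[1:]))


# The table from this post, without the inline annotation.
SAMPLE = """\
------ Message Queues --------
key        msqid   owner  perms used-bytes messages
0x8b000040 2031617 nagios 600   138176512  134938
"""
```

Calling `read_live_queues()` every minute or so and feeding the `messages` values for the nagios-owned queue into `is_stalled` would show roughly when the backlog stops draining, which can then be lined up against the ndo2db and database logs.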
ndo2db queue not clearing
-
- Former Nagios Staff
- Posts: 4583
- Joined: Wed Sep 21, 2016 10:29 am
- Location: NoLo, Minneapolis, MN
Re: ndo2db queue not clearing
Definitely take a look at https://support.nagios.com/kb/article/n ... d-139.html
Also, can you PM me your Profile? You can download it by going to Admin > System Config > System Profile and clicking the ***Download Profile*** button towards the top. If for whatever reason you *cannot* download the profile, please put the output of View System Info (5.3.4+, Show Profile if older) in the thread (that will at least get us some info). The profile gives us access to many of the logs we would otherwise ask for individually. If security is a concern, you can unzip the profile, take out what you like, and then zip it up again. We may end up needing something you remove, but we can ask for that specifically.
You can also generate a profile manually using the script at /usr/local/nagiosxi/html/includes/components/profile/getprofile.sh
That should generate a profile in /usr/local/nagiosxi/var/components/ which you can get off the server with an application such as FileZilla.
After you PM the profile, please update this thread. Updating this thread is the only way for it to show back up on our dashboard.
If you get an error that PROFILE BUILD FAILED, please see https://support.nagios.com/kb/article.p ... ategory=44
Re: ndo2db queue not clearing
Ok, attached is the profile. I've already seen that support article you listed and the kernel parameters were set prior to this.
kernel.msgmnb = 262144000
kernel.msgmax = 262144000
kernel.shmmax = 4294967295
kernel.shmall = 268435456
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
kernel.msgmni = 512000
I'm also not seeing any of the queue-related errors in the nagios.log file that the article talks about.
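For a rough sense of how close the queue from the first post is to its byte limit, the `used-bytes` figure can be compared against `kernel.msgmnb`, which caps the size of a single queue in bytes. This is only a sketch: `parse_sysctl` and `queue_fill_ratio` are hypothetical helper names, and the numbers are the ones posted in this thread, taken at a single point in time.

```python
from typing import Dict


def parse_sysctl(text: str) -> Dict[str, int]:
    """Parse `name = value` lines (sysctl.conf style) with integer values."""
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "=" not in line:
            continue
        name, value = line.split("=", 1)
        settings[name.strip()] = int(value.strip())
    return settings


def queue_fill_ratio(used_bytes: int, settings: Dict[str, int]) -> float:
    """Fraction of the per-queue byte limit (kernel.msgmnb) currently in use."""
    return used_bytes / settings["kernel.msgmnb"]


# Kernel parameters as posted above.
POSTED = """\
kernel.msgmnb = 262144000
kernel.msgmax = 262144000
kernel.shmmax = 4294967295
kernel.shmall = 268435456
kernel.msgmni = 512000
"""

settings = parse_sysctl(POSTED)
# used-bytes value from the ipcs output in the first post
ratio = queue_fill_ratio(138176512, settings)
print(f"queue is at {ratio:.1%} of kernel.msgmnb")
```

With the posted values the queue sits at a little over half of `kernel.msgmnb`, so at that snapshot it had not hit the byte limit. That would be consistent with nagios.log showing no queue errors, though it says nothing about why ndo2db stops draining it.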
Re: ndo2db queue not clearing
I PMed you the profile.zip.
Re: ndo2db queue not clearing
Those numbers don't look particularly high. Our article is fairly conservative. Can you try increasing them more? I've seen up to 10x the defaults on production XI systems.
Since both of your issues would seem to be performance related, it might be easier if you submit a ticket via https://support.nagios.com/tickets/ -- that's completely up to you, but we've got a couple of guys out on vacation and I think it would be easier to tackle as a single issue.
Re: ndo2db queue not clearing
I doubled the numbers for now, but we're still experiencing the issues. I'll set up a support ticket.
Re: ndo2db queue not clearing
Locking due to ticket received.