Scheduling Problem after Apply Configuration
Posted: Wed Jun 17, 2015 6:00 pm
by rajasegar
We are at a stage where every time we apply configuration, the scheduling screws up.
We need to apply it a few times before it is OK.
The root cause seems to be that ipcs -q shows more than 100k messages in the message queue.
When it stabilises after a few hours, the count drops to a few hundred to a few thousand.
Code:
[nagios@nagiosprodxi1 ~]$ ipcs -q
------ Message Queues --------
key msqid owner perms used-bytes messages
0xbf000002 3440640 nagios 600 1914139 2342
Please advise when this issue is going to be fixed.
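For anyone who wants to track the queue depth over time rather than eyeballing it, here is a minimal sketch that parses `ipcs -q` output like the sample above. The function name `parse_ipcs_queues` and the `BACKLOG_THRESHOLD` value are hypothetical, chosen only for illustration, and the parser assumes the six-column layout shown in this post.

```python
def parse_ipcs_queues(text):
    """Parse `ipcs -q` output into a list of dicts, one per message queue."""
    queues = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have six columns and start with a hex key such as 0xbf000002.
        if len(fields) != 6 or not fields[0].startswith("0x"):
            continue
        queues.append({
            "key": fields[0],
            "msqid": int(fields[1]),
            "owner": fields[2],
            "perms": fields[3],
            "used_bytes": int(fields[4]),
            "messages": int(fields[5]),
        })
    return queues


# Sample copied from the output posted above.
SAMPLE = """\
------ Message Queues --------
key msqid owner perms used-bytes messages
0xbf000002 3440640 nagios 600 1914139 2342
"""

# Hypothetical alert threshold, not a Nagios-recommended value.
BACKLOG_THRESHOLD = 50000

for q in parse_ipcs_queues(SAMPLE):
    status = "BACKLOG" if q["messages"] > BACKLOG_THRESHOLD else "ok"
    print(q["key"], q["messages"], status)
```

On a live system the text could come from running `ipcs -q` through `subprocess.run(["ipcs", "-q"], capture_output=True, text=True).stdout` instead of the hard-coded sample.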
Re: Scheduling Problem after Apply Configuration
Posted: Thu Jun 18, 2015 1:39 pm
by tmcdonald
How many hosts and services do you have in total? When you apply configuration there is a lot being written to the NDO database, and when we see problems it is usually the number of messages that shoots up, not the number of bytes.
Re: Scheduling Problem after Apply Configuration
Posted: Thu Jun 18, 2015 5:56 pm
by rajasegar
tmcdonald wrote: How many hosts and services do you have in total? When you apply configuration there is a lot being written to the NDO database, and when we see problems it is usually the number of messages that shoots up, not the number of bytes.
Hosts : 2837
Services : 20854
The scheduling graph always drops to a single bar of about 10 at most. We need to restart a few times.
When this problem happens, the message queue always shows 90k-plus messages.
Server performance (CPU, memory, I/O, disk storage) is all OK.
Re: Scheduling Problem after Apply Configuration
Posted: Fri Jun 19, 2015 9:14 am
by tgriep
Here is a possible solution you could try to see if it fixes the message queue problem.
Please try it on a test system first if you can.
Open /etc/sysctl.conf with a text editor. Edit the file to match the following values:
# Controls the default maximum size of a message queue, in bytes
kernel.msgmnb = 262144000
# Controls the maximum size of a single message, in bytes
kernel.msgmax = 262144000
# The maximum number of message queues allowed system-wide
kernel.msgmni = 512000
Note: If you don't have these entries in the "/etc/sysctl.conf" file, just add them to the end of the file.
After these settings are saved to the file, run:
sysctl -p
Then edit /etc/my.cnf and add the following.
[mysqld]
#InnoDB Settings
innodb_buffer_pool_size=3G
innodb_flush_log_at_trx_commit=2
#Other
max_connections=500
max_connect_errors=1000
skip-name-resolve
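To confirm that `sysctl -p` actually applied the kernel settings, one option is to read the live values back from `/proc/sys/kernel`. The sketch below assumes a Linux host; the helper names `read_msg_limits` and `mismatches` are hypothetical, and the expected values are simply the ones suggested in this post.

```python
import os

# Values taken from the sysctl.conf suggestion in this post.
EXPECTED = {"msgmnb": 262144000, "msgmax": 262144000, "msgmni": 512000}


def read_msg_limits(proc_dir="/proc/sys/kernel"):
    """Return the live msgmnb/msgmax/msgmni values, or None where unreadable."""
    limits = {}
    for name in EXPECTED:
        path = os.path.join(proc_dir, name)
        try:
            with open(path) as fh:
                limits[name] = int(fh.read().strip())
        except (OSError, ValueError):
            # File missing (non-Linux host) or content not an integer.
            limits[name] = None
    return limits


def mismatches(limits, expected=EXPECTED):
    """List the settings whose live value differs from the expected one."""
    return sorted(name for name, want in expected.items() if limits.get(name) != want)


if __name__ == "__main__":
    current = read_msg_limits()
    for name in mismatches(current):
        print(f"{name}: live={current[name]} expected={EXPECTED[name]}")
```

No output means all three live values match the file. The MySQL settings are not covered by this check.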
Re: Scheduling Problem after Apply Configuration
Posted: Sun Jun 21, 2015 8:10 am
by rajasegar
tgriep wrote:This may be a possible solution you could try to see if this fixes the Message Queue problem.
Please try it on a test system first if you can.
Thanks. Will update once I have tested this out.
Re: Scheduling Problem after Apply Configuration
Posted: Mon Jun 22, 2015 9:49 am
by tgriep
No problem, keep us in the loop.
Re: Scheduling Problem after Apply Configuration
Posted: Tue Jun 23, 2015 6:46 pm
by rajasegar
No positive changes noted, and no error messages anywhere.
The message count still starts at around 100k during Apply Configuration.
The only message noted in /var/log/messages was this, repeated every minute:
Code:
Jun 24 07:43:19 nagiosprodxi1 ndo2db: Trimming systemcommands.
Jun 24 07:43:19 nagiosprodxi1 ndo2db: Trimming servicechecks.
Jun 24 07:43:19 nagiosprodxi1 ndo2db: Trimming hostchecks.
Jun 24 07:43:19 nagiosprodxi1 ndo2db: Trimming eventhandlers.
After 30 minutes there were still 66k messages in the queue.
Code:
------ Message Queues --------
key msqid owner perms used-bytes messages
0x41000002 3637248 nagios 600 55632620 65918
The scheduling was swinging wildly between 200 and 1000 and finally settled at about 600, whereas it used to be 1000+ with a more evenly distributed pattern.
Attachment: 2015-06-24_07-41-54.png
I am going to revert all the changes to the DB configuration later.
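As a rough sanity check on the numbers in this thread, the sketch below works out the average message size from the two `ipcs -q` samples and a back-of-the-envelope drain time. It assumes the queue started at roughly 100,000 messages and drains at a constant rate, which is a simplification; the function names are hypothetical.

```python
def avg_message_bytes(used_bytes, messages):
    """Average size of a queued message in bytes."""
    return used_bytes / messages


def minutes_to_drain(start_count, current_count, elapsed_minutes):
    """Estimate minutes until the queue is empty, assuming a constant drain rate.

    Returns None when the queue is not shrinking.
    """
    rate = (start_count - current_count) / elapsed_minutes  # messages per minute
    if rate <= 0:
        return None
    return current_count / rate


# Figures quoted in the thread (used-bytes, messages).
steady = avg_message_bytes(1914139, 2342)
backlog = avg_message_bytes(55632620, 65918)
print(f"steady state: {steady:.0f} bytes/message")
print(f"backlog:      {backlog:.0f} bytes/message")

# Assumption: about 100,000 messages at the start, sampled 30 minutes
# later at 65,918.
eta = minutes_to_drain(100000, 65918, 30)
print(f"estimated minutes left to drain: {eta:.0f}")
```

On these figures the average message size is similar in both samples (roughly 800 to 850 bytes), which is consistent with the message count, rather than the size of individual messages, being what grows.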
Re: Scheduling Problem after Apply Configuration
Posted: Wed Jun 24, 2015 8:53 am
by tgriep
Thanks for trying this out. The changes worked for another customer, but every system is a little different.
We are still looking at the issue and will report back when there is a fix available.