
Scheduling Problem after Apply Configuration

Posted: Wed Jun 17, 2015 6:00 pm
by rajasegar
We are at a stage where every time we apply configuration, the scheduling breaks.
We need to apply it a few times before it is OK.

The root cause seems to be that ipcs -q shows more than 100k messages in the message queue.

When it stabilises after a few hours, the count drops to between a few hundred and a thousand.

Code:

[nagios@nagiosprodxi1 ~]$ ipcs -q

------ Message Queues --------
key        msqid      owner      perms      used-bytes   messages
0xbf000002 3440640    nagios     600        1914139      2342
Please advise when this issue is going to be fixed.

Re: Scheduling Problem after Apply Configuration

Posted: Thu Jun 18, 2015 1:39 pm
by tmcdonald
How many hosts and services do you have in total? When you apply configuration, a lot is written to the NDO database, and when we see problems it is usually the number of messages that shoots up, not the number of bytes.
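
Since the message count is the figure to watch, here is a minimal sketch for pulling it out of ipcs -q output. The function names and the 50000 threshold are made up for illustration and are not part of Nagios XI; the only assumption about the input is the column layout shown earlier in this thread, with the count in the sixth column.

```shell
#!/bin/sh
# Sum the "messages" column (6th field) of `ipcs -q` output read from stdin.
# Only rows whose first field looks like a key (0x...) are counted, so the
# banner and header lines are skipped.
queue_messages() {
    awk '$1 ~ /^0x/ { total += $6 } END { print total + 0 }'
}

# Hypothetical alert threshold; pick a value that suits your system.
THRESHOLD=50000

# Read `ipcs -q` output from stdin and report whether the queued message
# count is above the threshold.
check_queue() {
    count=$(queue_messages)
    if [ "$count" -gt "$THRESHOLD" ]; then
        echo "WARNING: $count messages queued"
    else
        echo "OK: $count messages queued"
    fi
}
```

It could be run as `ipcs -q | check_queue`, for example from cron, to see how the count behaves before and after an Apply Configuration.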

Re: Scheduling Problem after Apply Configuration

Posted: Thu Jun 18, 2015 5:56 pm
by rajasegar
tmcdonald wrote:How many hosts and services do you have in total? When you apply configuration, a lot is written to the NDO database, and when we see problems it is usually the number of messages that shoots up, not the number of bytes.
Hosts : 2837
Services : 20854

The scheduling graph always drops to a single bar of about 10 at most. We need to restart a few times.
When this problem happens, the message queue always shows 90k-plus messages.

Server performance (CPU, memory, I/O, disk storage) is all OK.

Re: Scheduling Problem after Apply Configuration

Posted: Fri Jun 19, 2015 9:14 am
by tgriep
This is a possible solution you could try to see whether it fixes the message queue problem.
Please try it on a test system first if you can.

Open /etc/sysctl.conf with a text editor. Edit the file to match the following values:
# Controls the default maximum size of a message queue, in bytes
kernel.msgmnb = 262144000

# Controls the maximum size of a single message, in bytes
kernel.msgmax = 262144000

# Controls the maximum number of message queue identifiers (queues) system-wide
kernel.msgmni = 512000

Note: If you don't have these entries in the "/etc/sysctl.conf" file, just add them to the end of the file.
After these settings are saved to the file, run:
sysctl -p
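
To confirm that the kernel actually picked up the new values, the live limits can be read back. The helper below is only a sketch: show_msg_limits is a made-up name, and the optional directory argument exists purely so it can be pointed at a copy of the files; on Linux the values are exposed under /proc/sys/kernel.

```shell
#!/bin/sh
# Print the live System V message queue limits, one per line, in the same
# "name = value" form that sysctl uses.
# $1 (optional): directory holding the msgmnb/msgmax/msgmni files;
#                defaults to /proc/sys/kernel.
show_msg_limits() {
    dir=${1:-/proc/sys/kernel}
    for name in msgmnb msgmax msgmni; do
        if [ -r "$dir/$name" ]; then
            echo "kernel.$name = $(cat "$dir/$name")"
        else
            echo "kernel.$name = (not readable)"
        fi
    done
}
```

Calling `show_msg_limits` with no argument should give the same information as `sysctl kernel.msgmnb kernel.msgmax kernel.msgmni`. If a line does not match what is in /etc/sysctl.conf, the setting was not applied.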

Then edit /etc/my.cnf and add the following.
[mysqld]

#InnoDB Settings
innodb_buffer_pool_size=3G
innodb_flush_log_at_trx_commit=2

#Other
max_connections=500
max_connect_errors=1000
skip-name-resolve

Re: Scheduling Problem after Apply Configuration

Posted: Sun Jun 21, 2015 8:10 am
by rajasegar
Thanks. Will update once I have tested this out.

Re: Scheduling Problem after Apply Configuration

Posted: Mon Jun 22, 2015 9:49 am
by tgriep
No problem, keep us in the loop.

Re: Scheduling Problem after Apply Configuration

Posted: Tue Jun 23, 2015 6:46 pm
by rajasegar
No positive changes noted and no error messages anywhere.
Messages still start from 100k during Apply Configuration.

The only message noted in /var/log/messages was this, repeated every minute:

Code:

Jun 24 07:43:19 nagiosprodxi1 ndo2db: Trimming systemcommands.
Jun 24 07:43:19 nagiosprodxi1 ndo2db: Trimming servicechecks.
Jun 24 07:43:19 nagiosprodxi1 ndo2db: Trimming hostchecks.
Jun 24 07:43:19 nagiosprodxi1 ndo2db: Trimming eventhandlers.
After 30 minutes there were still 66k messages in the queue.

Code:

------ Message Queues --------
key        msqid      owner      perms      used-bytes   messages
0x41000002 3637248    nagios     600        55632620     65918
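
For a rough idea of how long the backlog would take to clear, the two readings can be turned into a drain rate. This is only a back-of-the-envelope sketch: drain_estimate is a made-up helper, the starting figure of 100000 is the approximate "100k" mentioned above, and it assumes the queue drains at a constant net rate, which a live system will not do exactly.

```shell
#!/bin/sh
# Estimate how long a queue takes to drain, assuming a constant net rate.
# Arguments: starting count, current count, minutes elapsed (must be > 0).
# Prints "<messages per minute> <minutes remaining>" using integer arithmetic,
# or a notice and a non-zero status if the count is not going down.
drain_estimate() {
    start=$1
    now=$2
    minutes=$3
    rate=$(( (start - now) / minutes ))
    if [ "$rate" -le 0 ]; then
        echo "queue is not draining"
        return 1
    fi
    echo "$rate $(( now / rate ))"
}
```

With the numbers from this post, `drain_estimate 100000 65918 30` works out to about 1136 messages per minute, so roughly another 58 minutes to empty the queue at that pace.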

The scheduling hovered wildly between 200 and 1000 and finally settled at about 600, whereas it used to be above 1000 with a more evenly distributed pattern.
2015-06-24_07-41-54.png
Going to revert all the changes in the db configuration later.

Re: Scheduling Problem after Apply Configuration

Posted: Wed Jun 24, 2015 8:53 am
by tgriep
Thanks for trying this out. The changes worked for another customer but every system is a little different.
We are still looking at the issue and will report back when there is a fix available.