Nagios Task Scheduling

rajasegar
Posts: 1018
Joined: Sun Mar 30, 2014 10:49 pm

Nagios Task Scheduling

Post by rajasegar »

Nagios XI 2014R1.2
Offloaded DB
RHEL 6.5 x64
Manual Install
Firefox 30
30-06-2014 01-09-33 PM.png
Can you please advise on the graph pattern above?
It used to look like a wave with everything below 700; now the leftmost column is always very high.

We are not using Mod-Gearman.

Thanks
5 x Nagios 5.6.9 Enterprise Edition
RHEL 6 & 7
rrdcached & ramdisk optimisation
WillemDH
Posts: 2320
Joined: Wed Mar 20, 2013 5:49 am
Location: Ghent

Re: Nagios Task Scheduling

Post by WillemDH »

Hello,

We had a similar problem some time ago after a vmotion of the Nagios XI server to another datacenter.

See this thread for the solution proposed by Nagios support team: http://support.nagios.com/forum/viewtop ... 16&t=27351

Personally, I did not have to remove the auto-retention file; the problem slowly resolved itself after a few days.

Grtz

Willem
Nagios XI 5.8.1
https://outsideit.net

Re: Nagios Task Scheduling

Post by rajasegar »

Thanks. I removed retention.dat and restarted.
It was OK for a few seconds, then the same problem came back.
Load, CPU, and memory are all OK. What else could be the problem?

I also noticed that the 1-minute active check count is now 0 for both hosts and services; it used to be around 500 - 600.

30-06-2014 03-53-10 PM.png

Re: Nagios Task Scheduling

Post by rajasegar »

I installed Mod-Gearman and had the same problem.

After I increased the number of workers from 50 to 100, the problem went away.
30-06-2014 06-17-21 PM.png
2014-06-30 18:22:02 - localhost:4730 - v0.25

Code: Select all

 Queue Name           | Worker Available | Jobs Waiting | Jobs Running
-----------------------------------------------------------------------
 check_results        |               1  |           0  |           0
 eventhandler         |             100  |           0  |           0
 host                 |             100  |           0  |           4
 service              |             100  |           0  |          63
 worker_nagiosprodxi1 |               1  |           0  |           0
-----------------------------------------------------------------------
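
A table like the one above can also be checked mechanically. The following is only a minimal sketch: it assumes the table text is piped in exactly in the pipe-separated layout shown above, and `backlog_queues` is a hypothetical helper name, not part of Mod-Gearman.

```shell
#!/bin/sh
# Sketch: print the names of queues that have jobs waiting, given a
# gearman_top-style table on stdin (layout assumed to match the table above).
# backlog_queues is a hypothetical helper name used only for illustration.
backlog_queues() {
    # Fields are split on "|": $1 = queue name, $3 = jobs waiting.
    # "$3 + 0" forces a numeric comparison, so the header row evaluates to 0,
    # and the dashed rule lines are skipped because they have only one field.
    awk -F'|' 'NF == 4 && $3 + 0 > 0 { gsub(/ /, "", $1); print $1 }'
}
```

With the table above as input, nothing would be printed, since every queue shows 0 jobs waiting.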

However, my performance data files are not updating.
I have already changed the process-host-perfdata-file-bulk and process-service-perfdata-file-bulk commands to:

Code: Select all

 define command {
       command_name                             process-host-perfdata-file-bulk
       command_line                             sed -i 's/\\n//g' /usr/local/nagios/var/host-perfdata && /bin/mv /usr/local/nagios/var/host-perfdata /usr/local/nagios/var/spool/xidpe/$TIMET$.perfdata.host
}

define command {
       command_name                             process-service-perfdata-file-bulk
       command_line                             sed -i 's/\\n//g' /usr/local/nagios/var/service-perfdata && /bin/mv /usr/local/nagios/var/service-perfdata /usr/local/nagios/var/spool/xidpe/$TIMET$.perfdata.service
}

Code: Select all

[nagios@nagiosprodxi1 mod_gearman]$ tail mod_gearman_worker.log
[2014-06-30 18:02:56][20693][ERROR] sending job to gearmand failed: flush(GEARMAN_COULD_NOT_CONNECT) localhost:4730 -> libgearman/connection.cc:498
[2014-06-30 18:02:57][18900][ERROR] sending job to gearmand failed: flush(GEARMAN_COULD_NOT_CONNECT) localhost:4730 -> libgearman/connection.cc:498
[2014-06-30 18:02:57][19127][ERROR] sending job to gearmand failed: flush(GEARMAN_COULD_NOT_CONNECT) localhost:4730 -> libgearman/connection.cc:498
[2014-06-30 18:02:57][21006][ERROR] sending job to gearmand failed: flush(GEARMAN_COULD_NOT_CONNECT) localhost:4730 -> libgearman/connection.cc:498
[2014-06-30 18:02:57][19889][ERROR] sending job to gearmand failed: flush(GEARMAN_COULD_NOT_CONNECT) localhost:4730 -> libgearman/connection.cc:498
[2014-06-30 18:02:57][20467][ERROR] sending job to gearmand failed: flush(GEARMAN_COULD_NOT_CONNECT) localhost:4730 -> libgearman/connection.cc:498
[2014-06-30 18:02:57][19670][ERROR] sending job to gearmand failed: flush(GEARMAN_COULD_NOT_CONNECT) localhost:4730 -> libgearman/connection.cc:498
[2014-06-30 18:02:57][18456][ERROR] sending job to gearmand failed: flush(GEARMAN_COULD_NOT_CONNECT) localhost:4730 -> libgearman/connection.cc:498
[2014-06-30 18:02:58][17857][ERROR] sending job to gearmand failed: flush(GEARMAN_COULD_NOT_CONNECT) localhost:4730 -> libgearman/connection.cc:498
[2014-06-30 18:03:04][2359][INFO ] mod_gearman worker daemon started with pid 2359
[nagios@nagiosprodxi1 mod_gearman]$

Code: Select all

[nagios@nagiosprodxi1 mod_gearman]$ tail mod_gearman_neb.log
[2014-06-30 18:02:53][32692][ERROR] sending job to gearmand failed: flush(GEARMAN_COULD_NOT_CONNECT) localhost:4730 -> libgearman/connection.cc:498

Code: Select all

[nagios@nagiosprodxi1 mod_gearman]$ nc -z localhost 4730
Connection to localhost 4730 port [tcp/gearman] succeeded!

[nagios@nagiosprodxi1 mod_gearman]$ sudo service gearmand status
gearmand (pid  2315) is running...
[nagios@nagiosprodxi1 mod_gearman]$ sudo service mod_gearman_worker status
mod_gearman_worker is running with pid 2359
[nagios@nagiosprodxi1 mod_gearman]$

Can anybody help out? I am still troubleshooting.

Re: Nagios Task Scheduling

Post by rajasegar »

This is a strange day. I am posting the solution to my own post.

Code: Select all

[nagios@nagiosprodxi1 raja]$ sudo tail -f /var/log/messages
Jun 30 19:05:23 nagiosprodxi1 nagios: Warning: Attempting to execute the command "sed -i 's/\\n//g' /usr/local/nagios/var/host-perfdata &amp" resulted in a return code of 127.  Make sure the script or binary you are trying to execute actually exists...
Jun 30 19:05:24 nagiosprodxi1 nagios: Warning: Attempting to execute the command "sed -i 's/\\n//g' /usr/local/nagios/var/service-perfdata &amp" resulted in a return code of 127.  Make sure the script or binary you are trying to execute actually exists...

It appears that when I pasted the command from Excel into the CCM, the && got changed to &amp;&amp;, and this broke the performance data update. When I changed it back to && from the command line and restarted, it was fine again.

Is this a bug?
I just tested with a new test command containing only &&, and this is what ended up in the config file:

Code: Select all

define command {
       command_name                             test
       command_line                             &amp;&amp;
}
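
The log lines in this post show the command ending at `&amp`. One plausible explanation, which is an assumption and not confirmed in this thread, is that Nagios treats `;` in object config files as the start of a comment, so everything from the first `;` of `&amp;&amp;` onward is dropped. A quick way to look for the escaped form in a config file might be something like the sketch below; `find_escaped_amp` is a hypothetical helper name and the example path is an assumption.

```shell
#!/bin/sh
# Sketch: list lines in a config file that contain an HTML-escaped
# ampersand ("&amp"), the form that replaced "&&" in this thread.
# find_escaped_amp is a hypothetical helper name, not a Nagios tool.
find_escaped_amp() {
    # -n prints line numbers, -F treats the pattern as a fixed string
    grep -nF '&amp' "$1"
}

# Example call (the path is an assumption; adjust to your installation):
# find_escaped_amp /usr/local/nagios/etc/commands.cfg
```

Any line it prints is worth re-checking in the CCM, since manual edits to the written config files get overwritten on the next apply.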
slansing
Posts: 7698
Joined: Mon Apr 23, 2012 4:28 pm
Location: Travelling through time and space...

Re: Nagios Task Scheduling

Post by slansing »

Why were you trying to copy it from Excel? Did you try typing "&&" directly into the field in the CCM? When you save it, does it still show &amp&amp? I'm not sure how you changed commands.cfg manually from the command line, as Nagios should not keep any manual config changes: when the files get written out during the apply config process, the values are taken from the MySQL DB and would overwrite your manual changes. Hence the warning message at the top of all configurations created via the CCM.

Edit: I just confirmed this as a bug in 2014R1.2 and am making a report right now.

Re: Nagios Task Scheduling

Post by rajasegar »

Well, I keep my notes in Excel. Anyway, it was not Excel's fault, as it still gets converted when I type it in manually.
The funny thing is that it was fine in development.

Please note that the warning message was from around 19:00 and I solved the problem around 19:20, so it was an old log message.

Anyway, after I made the change via the command line and applied the changes, it worked. Don't ask me how.
I will double-check whether the change is still there when I get to work. If it is not, I will try the " ".

Re: Nagios Task Scheduling

Post by slansing »

Cool, yeah, we are looking into it right now. Just an FYI: as soon as you apply config, the manual change you made will be reverted.

Re: Nagios Task Scheduling

Post by rajasegar »

Applied the ccm.zip patch and there is no more &amp;&amp;.
Hopefully this is the last of the CCM issues. Thanks.

Edit: The /n and \n are still there after applying the patch.

Re: Nagios Task Scheduling

Post by slansing »

Okay, that sed is only there to remove newlines from your perfdata, which it should take care of. Are you running 32-bit or 64-bit? I don't recall.
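
For reference, here is a small sketch of what the `s/\\n//g` expression from the perfdata commands does when it reaches sed in that form: it deletes the literal two-character sequence backslash + n from each line. The `-i` flag is left out so that no real file is touched, the sample perfdata string is made up, and `strip_literal_newlines` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: apply the same sed expression as the perfdata commands to stdin.
# strip_literal_newlines is a hypothetical helper name for illustration.
strip_literal_newlines() {
    # In the regex, \\ matches one literal backslash and n matches the letter n,
    # so every literal "\n" text sequence is removed from each input line.
    sed 's/\\n//g'
}

# Made-up sample line; printf %s passes the backslash-n through unchanged.
printf '%s\n' 'rta=0.5ms\n;pl=0%' | strip_literal_newlines
# prints: rta=0.5ms;pl=0%
```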