Nagios distributed Monitoring

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
Locked
[email protected]
Posts: 66
Joined: Tue Aug 07, 2018 2:24 am

Nagios distributed Monitoring

Post by [email protected] »

Hi team,

Can anyone help me with Nagios distributed monitoring?
I am using Nagios XI with 1000+ hosts and 30000+ services, which is causing severe slowness in Nagios XI.
mbellerue
Posts: 1403
Joined: Fri Jul 12, 2019 11:10 am

Re: Nagios distributed Monitoring

Post by mbellerue »

Have you read through the Integrating Mod-Gearman With Nagios XI document? This should be everything you need to get some of those service checks off-loaded.
https://assets.nagios.com/downloads/nag ... ios_XI.pdf

Another approach would be to move some of the service checks to passive checks. Do you know if you're using any passive checks right now, or if it's all active checks? There's a lot to read up on Active vs Passive checks, but it can be worth it in situations like this where you have tens of thousands of checks.
https://assets.nagios.com/downloads/nag ... hecks.html
https://assets.nagios.com/downloads/nag ... hecks.html
https://assets.nagios.com/downloads/ncp ... Checks.pdf
https://assets.nagios.com/downloads/nag ... ios-XI.pdf
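
As a rough illustration of the passive approach: a passive result is a single line written to the Nagios external command file, using the PROCESS_SERVICE_CHECK_RESULT external command. The sketch below only formats such a line. The function name, host name, and service name are made up for illustration, and the command file path in the usage note is the usual default for source installs, so treat it as an assumption and check command_file in nagios.cfg.

```shell
# Sketch: build one passive service check result line.
# Arguments: host name, service description, return code, plugin output.
# Return codes: 0=OK, 1=WARNING, 2=CRITICAL, 3=UNKNOWN.
passive_result() {
    printf '[%s] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%s;%s\n' \
        "$(date +%s)" "$1" "$2" "$3" "$4"
}
```

For example, `passive_result web01 "Disk Usage" 0 "DISK OK" > /usr/local/nagios/var/rw/nagios.cmd` would submit an OK result for a hypothetical service, provided that service is configured to accept passive checks.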

One last option is to offload the MySQL/MariaDB database, though if this is a physical server you might just put the databases on a separate disk array.
https://assets.nagios.com/downloads/nag ... Server.pdf
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.

Be sure to check out our Knowledgebase for helpful articles and solutions!
[email protected]
Posts: 66
Joined: Tue Aug 07, 2018 2:24 am

Re: Nagios distributed Monitoring

Post by [email protected] »

We are using active checks.
[email protected]
Posts: 66
Joined: Tue Aug 07, 2018 2:24 am

Re: Nagios distributed Monitoring

Post by [email protected] »

I tried to install Mod-Gearman using the document below:

https://assets.nagios.com/downloads/nag ... ios_XI.pdf

I am getting the error below in the log file:

ERROR 2020-01-05 18:02:24.000000 [ 1 ] lost connection to client recv(EPIPE || ECONNRESET || EHOSTDOWN)(Connection reset by peer) -> libgearman-server/io.cc:100
ERROR 2020-01-05 18:02:24.000000 [ 1 ] closing connection due to previous errno error -> libgearman-server/io.cc:109
jdunitz
Posts: 235
Joined: Wed Feb 05, 2020 2:50 pm

Re: Nagios distributed Monitoring

Post by jdunitz »

Can you check to be sure that the date and time on your server are correct? You're running NTP to keep your clock synced, yes?

I ask this because the date in your example is from over a month ago, and some people have had problems in the past with Gearman and incorrect dates leading to unexpectedly-closed connections and errors very similar to yours.

Have a look at that and let us know if it helps.

--Jeffrey
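
One quick way to act on the clock question is to compare the timestamp in a log line with the server's current time. The sketch below assumes GNU date (as shipped on CentOS/RHEL); the function name is made up for illustration.

```shell
# Sketch: print how many seconds in the past a "YYYY-MM-DD HH:MM:SS"
# timestamp lies, relative to the system clock. Requires GNU date (-d).
log_age_seconds() {
    ts=$(date -d "$1" +%s) || return 1
    echo $(( $(date +%s) - ts ))
}
```

Running `log_age_seconds "2020-01-05 18:02:24"` prints the gap in seconds. A large value for an entry that was just written means the log timestamp and the system clock disagree. On systemd-based systems, `timedatectl` also reports whether the clock is synchronized.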
[email protected]
Posts: 66
Joined: Tue Aug 07, 2018 2:24 am

Re: Nagios distributed Monitoring

Post by [email protected] »

After installing Mod-Gearman I restarted Nagios and am getting the error below.

Feb 19 14:44:12 nagios-mod-gearman nagios[13471]: wproc: Registry request: name=Core Worker 13473;pid=13473
Feb 19 14:44:12 nagios-mod-gearman nagios[13471]: wproc: Registry request: name=Core Worker 13475;pid=13475
Feb 19 14:44:12 nagios-mod-gearman nagios[13471]: wproc: Registry request: name=Core Worker 13474;pid=13474
Feb 19 14:44:12 nagios-mod-gearman nagios[13471]: wproc: Registry request: name=Core Worker 13477;pid=13477
Feb 19 14:44:12 nagios-mod-gearman nagios[13471]: Error: Function nebmodule_init() in module '/usr/lib64/mod_gearman/mod_gearman_nagios4.o' returned an error. Module will be unloaded.
Feb 19 14:44:12 nagios-mod-gearman systemd[1]: nagios.service: main process exited, code=exited, status=1/FAILURE
Feb 19 14:44:12 nagios-mod-gearman kill[13481]: kill: cannot find process ""
Feb 19 14:44:12 nagios-mod-gearman systemd[1]: nagios.service: control process exited, code=exited status=1
Feb 19 14:44:14 nagios-mod-gearman systemd[1]: Unit nagios.service entered failed state.
Feb 19 14:44:14 nagios-mod-gearman systemd[1]: nagios.service failed.
tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Re: Nagios distributed Monitoring

Post by tgriep »

Don't worry about the incorrect time stamp in the Gearman log file; it is a known issue and can be ignored.

Please restart the nagios process on the server to generate the error.
Then get the following files and upload them to the ticket.

Code:

/usr/local/nagios/etc/nagios.cfg
/usr/local/nagios/var/nagios.log

Also, run the following as root and post the results.

Code:

ls -l /usr/lib64/mod_gearman/
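
Alongside the directory listing, it can help to confirm that every module named in nagios.cfg is actually present on disk. A minimal sketch, assuming broker_module lines have the form broker_module=<path> followed by optional arguments; the function name is made up for illustration.

```shell
# Sketch: for each active (uncommented) broker_module line in the given
# Nagios config file, report whether the referenced module file exists.
check_broker_modules() {
    grep -E '^[[:space:]]*broker_module=' "$1" | while IFS= read -r line; do
        path=${line#*=}      # drop everything up to and including "="
        path=${path%% *}     # keep only the module path, drop its arguments
        if [ -f "$path" ]; then
            echo "found:   $path"
        else
            echo "missing: $path"
        fi
    done
}
```

Usage would be `check_broker_modules /usr/local/nagios/etc/nagios.cfg`, run as a user that can read the config file.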
[email protected]
Posts: 66
Joined: Tue Aug 07, 2018 2:24 am

Re: Nagios distributed Monitoring

Post by [email protected] »

Please find the details below.

-rw-r--r-- 1 root root 597520 Dec 1 2018 mod_gearman_naemon.o
-rw-r--r-- 1 root root 529728 Dec 1 2018 mod_gearman_nagios3.o
-rw-r--r-- 1 root root 535280 Dec 1 2018 mod_gearman_nagios4.o
tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Re: Nagios distributed Monitoring

Post by tgriep »

Change the Gearman broker line to the following in the /usr/local/nagios/etc/nagios.cfg file.

Code:

broker_module=/usr/lib64/mod_gearman/mod_gearman_nagios4.o config=/etc/mod_gearman/module.conf eventhandler=no
Save it and restart nagios to see if it starts and runs.

If not, upload the nagios.log file again.
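
The edit can also be scripted. A minimal sketch, assuming the existing Mod-Gearman line starts with broker_module= and mentions mod_gearman; the function name and the .bak backup suffix are made up for illustration.

```shell
# Sketch: replace the Mod-Gearman broker_module line in a Nagios config file
# with the line suggested in this thread, keeping a backup copy first.
set_gearman_broker() {
    cfg=$1
    new='broker_module=/usr/lib64/mod_gearman/mod_gearman_nagios4.o config=/etc/mod_gearman/module.conf eventhandler=no'
    cp -p "$cfg" "$cfg.bak" || return 1
    # Read from the backup and write the edited copy over the original.
    sed "s|^broker_module=.*mod_gearman.*|$new|" "$cfg.bak" > "$cfg"
}
```

As root, `set_gearman_broker /usr/local/nagios/etc/nagios.cfg` followed by `systemctl restart nagios` would apply the change; other broker_module lines are left untouched, and the previous file remains available as nagios.cfg.bak.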