Nagios distributed Monitoring

Posted: Mon Feb 10, 2020 2:07 pm
Hi team,

Can anyone help me with Nagios distributed monitoring?
I am using Nagios XI with 1000+ hosts and 30000+ services, which is causing Nagios XI to run very slowly.

Re: Nagios distributed Monitoring

Posted: Mon Feb 10, 2020 5:29 pm
by mbellerue
Have you read through the Integrating Mod-Gearman With Nagios XI document? This should be everything you need to get some of those service checks off-loaded.
https://assets.nagios.com/downloads/nag ... ios_XI.pdf

Another approach would be to move some of the service checks to passive checks. Do you know if you're using any passive checks right now, or if it's all active checks? There's a lot to read up on Active vs Passive checks, but it can be worth it in situations like this where you have tens of thousands of checks.
https://assets.nagios.com/downloads/nag ... hecks.html
https://assets.nagios.com/downloads/nag ... hecks.html
https://assets.nagios.com/downloads/ncp ... Checks.pdf
https://assets.nagios.com/downloads/nag ... ios-XI.pdf
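
To make the passive-check idea concrete, here is a minimal sketch of how a remote host could build a passive service result in the Nagios Core external command format (`PROCESS_SERVICE_CHECK_RESULT`). The host name, service name, and command file path in it are assumptions for illustration, not values from this thread, and the target service must already be defined and set to accept passive checks. In Nagios XI, passive results are often delivered through NRDP or NSCA instead of writing to the command file directly.

```python
import time

def format_passive_result(host, service, return_code, output, timestamp=None):
    """Build one PROCESS_SERVICE_CHECK_RESULT external command line."""
    if return_code not in (0, 1, 2, 3):
        raise ValueError("return_code must be 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN)")
    if timestamp is None:
        timestamp = int(time.time())
    # A newline ends the command, so flatten multi-line plugin output.
    clean_output = output.replace("\n", " ")
    return (
        f"[{timestamp}] PROCESS_SERVICE_CHECK_RESULT;"
        f"{host};{service};{return_code};{clean_output}\n"
    )

def submit(line, command_file="/usr/local/nagios/var/rw/nagios.cmd"):
    """Write the command to the external command file (path is an assumed default)."""
    with open(command_file, "w") as fh:
        fh.write(line)

if __name__ == "__main__":
    # "web01" and "Disk Usage" are hypothetical names.
    print(format_passive_result("web01", "Disk Usage", 1, "DISK WARNING - 85% used"), end="")
```

Because the monitored host pushes the result, the Nagios server does not spend a worker process running the check itself.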

One last option is to offload the MySQL/MariaDB database, though if this is a physical server, you might just put the databases on a separate disk array.
https://assets.nagios.com/downloads/nag ... Server.pdf

Re: Nagios distributed Monitoring

Posted: Tue Feb 11, 2020 10:32 pm
We are using active checks.

Re: Nagios distributed Monitoring

Posted: Tue Feb 11, 2020 10:36 pm
I tried to install Mod-Gearman using the document below:

https://assets.nagios.com/downloads/nag ... ios_XI.pdf

I am getting the errors below in the log file:

ERROR 2020-01-05 18:02:24.000000 [ 1 ] lost connection to client recv(EPIPE || ECONNRESET || EHOSTDOWN)(Connection reset by peer) -> libgearman-server/io.cc:100
ERROR 2020-01-05 18:02:24.000000 [ 1 ] closing connection due to previous errno error -> libgearman-server/io.cc:109

Re: Nagios distributed Monitoring

Posted: Wed Feb 12, 2020 12:58 pm
by jdunitz
Can you check that the date and time on your server are correct? You're running NTP to keep your clock synced, yes?

I ask this because the date in your example is from over a month ago, and some people have had problems in the past with Gearman and incorrect dates leading to unexpectedly-closed connections and errors very similar to yours.

Have a look at that and let us know if it helps.

--Jeffrey

Re: Nagios distributed Monitoring

Posted: Tue Feb 18, 2020 10:39 pm
After installing Mod-Gearman, I restarted Nagios and am getting the error below:

Feb 19 14:44:12 nagios-mod-gearman nagios[13471]: wproc: Registry request: name=Core Worker 13473;pid=13473
Feb 19 14:44:12 nagios-mod-gearman nagios[13471]: wproc: Registry request: name=Core Worker 13475;pid=13475
Feb 19 14:44:12 nagios-mod-gearman nagios[13471]: wproc: Registry request: name=Core Worker 13474;pid=13474
Feb 19 14:44:12 nagios-mod-gearman nagios[13471]: wproc: Registry request: name=Core Worker 13477;pid=13477
Feb 19 14:44:12 nagios-mod-gearman nagios[13471]: Error: Function nebmodule_init() in module '/usr/lib64/mod_gearman/mod_gearman_nagios4.o' returned an error. Module will be unloaded.
Feb 19 14:44:12 nagios-mod-gearman systemd[1]: nagios.service: main process exited, code=exited, status=1/FAILURE
Feb 19 14:44:12 nagios-mod-gearman kill[13481]: kill: cannot find process ""
Feb 19 14:44:12 nagios-mod-gearman systemd[1]: nagios.service: control process exited, code=exited status=1
Feb 19 14:44:14 nagios-mod-gearman systemd[1]: Unit nagios.service entered failed state.
Feb 19 14:44:14 nagios-mod-gearman systemd[1]: nagios.service failed.
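
For anyone triaging a similar startup failure, the line that matters in the output above is the `nebmodule_init()` error. The systemd lines after it are only consequences. A rough sketch of pulling the failing module path out of a log follows. The function name is made up for illustration, and the pattern is based only on the error text shown in this post, so other Nagios versions may word it differently.

```python
import re

# Pattern taken from the error line quoted in this post.
NEB_INIT_FAILURE = re.compile(
    r"Error: Function nebmodule_init\(\) in module '([^']+)' returned an error"
)

def failed_broker_modules(log_lines):
    """Return the paths of broker modules whose init function failed, in order seen."""
    failed = []
    for line in log_lines:
        match = NEB_INIT_FAILURE.search(line)
        if match and match.group(1) not in failed:
            failed.append(match.group(1))
    return failed

if __name__ == "__main__":
    sample = [
        "Feb 19 14:44:12 nagios-mod-gearman nagios[13471]: wproc: Registry request: name=Core Worker 13473;pid=13473",
        "Feb 19 14:44:12 nagios-mod-gearman nagios[13471]: Error: Function nebmodule_init() in module "
        "'/usr/lib64/mod_gearman/mod_gearman_nagios4.o' returned an error. Module will be unloaded.",
    ]
    for path in failed_broker_modules(sample):
        print(path)
```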

Re: Nagios distributed Monitoring

Posted: Wed Feb 19, 2020 2:35 pm
by tgriep
Don't worry about the incorrect timestamp in the Gearman log file; it is a known issue and can be ignored.

Please restart the nagios process on the server to generate the error.
Then get the following files and upload them to the ticket.

/usr/local/nagios/etc/nagios.cfg
/usr/local/nagios/var/nagios.log

Also, run the following as root and post the results.

ls -l /usr/lib64/mod_gearman/

Re: Nagios distributed Monitoring

Posted: Wed Feb 19, 2020 9:38 pm
Please find the details below.

-rw-r--r-- 1 root root 597520 Dec 1 2018 mod_gearman_naemon.o
-rw-r--r-- 1 root root 529728 Dec 1 2018 mod_gearman_nagios3.o
-rw-r--r-- 1 root root 535280 Dec 1 2018 mod_gearman_nagios4.o

Re: Nagios distributed Monitoring

Posted: Thu Feb 20, 2020 10:11 am
by tgriep
Change the Gearman broker line to the following in the /usr/local/nagios/etc/nagios.cfg file.

broker_module=/usr/lib64/mod_gearman/mod_gearman_nagios4.o config=/etc/mod_gearman/module.conf eventhandler=no

Save it and restart nagios to see if it starts and runs.

If not, upload the nagios.log file again.
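
Before restarting, it can help to confirm that the files named on the broker line actually exist. The sketch below assumes the line has the shape shown above: a module path followed by whitespace-separated `key=value` arguments. The helper names are hypothetical, and this only checks that the files are present. It does not validate the contents of module.conf.

```python
import os

def parse_broker_module(line):
    """Split a broker_module line into (module_path, {argument: value})."""
    key, _, value = line.strip().partition("=")
    if key.strip() != "broker_module":
        raise ValueError("not a broker_module line")
    parts = value.split()
    if not parts:
        raise ValueError("broker_module line has no module path")
    module_path, raw_args = parts[0], parts[1:]
    args = {}
    for arg in raw_args:
        name, _, val = arg.partition("=")
        args[name] = val
    return module_path, args

def missing_files(module_path, args):
    """Return the referenced files (module and config, if given) that do not exist."""
    candidates = [module_path]
    if "config" in args:
        candidates.append(args["config"])
    return [path for path in candidates if not os.path.isfile(path)]

if __name__ == "__main__":
    line = ("broker_module=/usr/lib64/mod_gearman/mod_gearman_nagios4.o "
            "config=/etc/mod_gearman/module.conf eventhandler=no")
    module_path, args = parse_broker_module(line)
    for path in missing_files(module_path, args):
        print(f"missing: {path}")
```

If either path is reported missing, the module cannot initialize no matter what the rest of the configuration looks like.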