Nagios script to check Nagios

Posted: Wed Mar 09, 2022 12:57 am
by SalinaSJames
Hi

I am currently running six Nagios instances across multiple projects. My concept is to run one Nagios instance per project, each responsible for the machines below it and for project-specific tasks.

Yes, I could roll all of these into one Nagios instance but this is not the question.

I have checked Google and Nagios Exchange looking for a plugin and if there is nothing out there I will build one myself. I want to know if anyone has any experience with this.

Question: Is there a Nagios plugin that checks the overall status of another, remote Nagios instance, either through NRPE and a local script or over authenticated HTTP(S) to the cgi-bin, simply reporting how many are OK / Warning / Critical / Unknown, etc. in each checked instance? HTTP(S) would be preferred.

If not, can someone point me in the direction of how to query a single Nagios instance and interpret its responses? If there are no existing plugins, I will start by looking at Nagstamon for guidance on how to achieve this.
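
For anyone wanting a starting point, here is a minimal sketch in Python of the HTTP(S) approach. It assumes the remote instance is Nagios Core 4 with the JSON CGIs enabled, that `statusjson.cgi?query=servicecount` is reachable under the cgi-bin URL, and that the reply carries the counts under `data` -> `count`. Those details should be verified against your own installation. The function names and the output wording are hypothetical, not taken from an existing plugin.

```python
import base64
import json
import urllib.request

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def summarize(counts):
    """Turn a dict of state counts into (exit_code, status_line)."""
    ok = counts.get("ok", 0)
    warning = counts.get("warning", 0)
    critical = counts.get("critical", 0)
    unknown = counts.get("unknown", 0)
    # Worst state wins: critical, then warning, then unknown.
    if critical:
        code, label = CRITICAL, "CRITICAL"
    elif warning:
        code, label = WARNING, "WARNING"
    elif unknown:
        code, label = UNKNOWN, "UNKNOWN"
    else:
        code, label = OK, "OK"
    text = (f"REMOTE NAGIOS {label} - {ok} ok, {warning} warning, "
            f"{critical} critical, {unknown} unknown")
    perf = f"ok={ok} warning={warning} critical={critical} unknown={unknown}"
    return code, f"{text} | {perf}"


def fetch_counts(base_url, user, password, timeout=10):
    """Fetch service counts over HTTP basic auth.

    ASSUMPTION: the query name and the data/count layout match your
    Nagios version; adjust both if they do not.
    """
    url = base_url.rstrip("/") + "/statusjson.cgi?query=servicecount"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req = urllib.request.Request(url, headers={"Authorization": "Basic " + token})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        payload = json.load(resp)
    return payload["data"]["count"]


def main(argv):
    """Plugin entry point: argv is [prog, BASE_URL, USER, PASSWORD]."""
    if len(argv) != 4:
        print("UNKNOWN - usage: check_remote_nagios.py BASE_URL USER PASSWORD")
        return UNKNOWN
    try:
        counts = fetch_counts(argv[1], argv[2], argv[3])
    except Exception as exc:  # network, auth, or parsing failure
        print(f"UNKNOWN - could not query remote Nagios: {exc}")
        return UNKNOWN
    code, line = summarize(counts)
    print(line)
    return code
```

To use it as a plugin, end the script with `sys.exit(main(sys.argv))` and define one service per remote instance. The worst-state-wins rule in `summarize` is only one possible policy; thresholds on the counts would work just as well.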

Re: Nagios script to check Nagios

Posted: Sun Mar 20, 2022 11:25 pm
by PhumeleleSJose96
Monitor everything you have time to configure to be monitored plus anything which is really critical.

On all of my Linux systems I monitor:

Current Load

Current Users

Disk Space

NTP Time

Ping

Swap

Total Processes

backup has run

crond running

mail queues

md raid

munin

ntpd running

postfix running

puppetd running

ssh

sshd running

And on particular machines I monitor such things as:

various websites, checking that they are reachable, that the page retrieved contains certain content, and that it responds within a certain amount of time

that certain processes or events have occurred within a certain amount of time

that certain values, such as the average of a certain number of fields in a database table within the last hour, are within certain tolerances

and various and sundry other things that can happen (or NOT happen) as the case may be. It is pretty easy to write custom scripts to monitor whatever you want. I graph many thousands of items with munin in a similar way. I set it up once, write a puppet manifest for it, then puppet handles it on new machines from then on. It is pretty simple to deploy this stuff now. In just one of my Nagios installations (I have a few) I have 80 hosts and 1120 services being monitored, and that's nothing compared to what some people have.
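
As an illustration of how small such a custom script can be, here is a rough Python sketch of a "has this happened recently" check, for example a backup that touches a marker file when it finishes. The function name, the thresholds, and the marker-file idea are assumptions made for the example. The only fixed parts are the usual plugin conventions: one line of output and exit code 0, 1, 2 or 3 for OK, WARNING, CRITICAL or UNKNOWN.

```python
import os
import time

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_file_age(path, warn_seconds, crit_seconds, now=None):
    """Return (exit_code, status_line) based on how old `path` is.

    `now` can be passed in for testing; it defaults to the current time.
    """
    now = time.time() if now is None else now
    try:
        mtime = os.path.getmtime(path)
    except OSError as exc:
        # A missing or unreadable file is reported as UNKNOWN.
        return UNKNOWN, f"FILE_AGE UNKNOWN - cannot stat {path}: {exc}"
    age = int(now - mtime)
    # Performance data: label=value[unit];warn;crit
    perf = f"age={age}s;{warn_seconds};{crit_seconds}"
    if age >= crit_seconds:
        return CRITICAL, f"FILE_AGE CRITICAL - {path} is {age}s old | {perf}"
    if age >= warn_seconds:
        return WARNING, f"FILE_AGE WARNING - {path} is {age}s old | {perf}"
    return OK, f"FILE_AGE OK - {path} is {age}s old | {perf}"
```

A wrapper would call something like `check_file_age("/var/run/backup.done", 90000, 180000)` (a hypothetical path and thresholds), print the returned line, and exit with the returned code.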