Distributed Monitoring with Windows Environments
-
- Posts: 32
- Joined: Tue Aug 15, 2017 1:20 am
Distributed Monitoring with Windows Environments
Hello everyone,
I'm looking for a solution you might be able to help me with. At our company we run Nagios XI. Our primary use case is not monitoring our internal infrastructure, but that of our customers. So far so good. We currently monitor about 650 hosts and 2500 services across 90 customers. That's all working great.
Since we mostly use NCPA with active checks, we also have to maintain 90 VPN tunnels for the checks to work. While this does work nicely, it has some drawbacks.
First, when a VPN tunnel goes down or restarts, all hosts and services of that customer are reported down (of course). Second, maintaining 90 VPN tunnels on our firewall isn't exactly ideal either.
To mitigate this, I'd ideally like to have several hosts at each customer's site that run the checks and report the results to us over one single public IP. Since 99% of client servers run Windows, I'd want something that runs on Windows (otherwise I'd have to set up Linux VMs for that).
Any tips on what to look at for getting the monitoring out to our customers' sites?
Re: Distributed Monitoring with Windows Environments
Quote: To mitigate those situations I'd ideally want to have several hosts at our customers' site that run the checks and report the results over one single public IP to us
If you are able to do active checks already, then use passive checks with NCPA using NRDP.
XI includes NRDP; just make sure to put the NRDP URL and token in the ncpa.cfg file.
Each host (or customer Windows PC) can have its own hostname, but every one of them will need your XI NRDP URL and token.
The handler has to be set to nrdp as well.
Here's our documentation on setting up passive checks. You can create passive checks using the NCPA GUI.
https://assets.nagios.com/downloads/ncp ... Checks.pdf
Let me know if this works for you.
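The settings mentioned above (NRDP URL, token, per-host hostname, handler set to nrdp) can be sketched as a small Python helper that renders the relevant configuration text, which may be handy when the same settings have to go out to many customer hosts. This is an illustration only: the section and key names (`[passive]` with `handlers`, `[nrdp]` with `parent`, `token`, `hostname`) are an assumption based on NCPA 2.x defaults and should be compared against the ncpa.cfg shipped with your agent. The function name, URL, token and hostname are placeholders.

```python
import configparser
import io


def render_nrdp_settings(nrdp_url, token, hostname):
    """Return ncpa.cfg-style text that enables the NRDP handler for one agent.

    Section and key names are assumptions based on NCPA 2.x defaults.
    """
    # interpolation=None keeps a literal '%' in a token from being expanded.
    cfg = configparser.ConfigParser(interpolation=None)
    cfg["passive"] = {"handlers": "nrdp"}  # send passive results via NRDP
    cfg["nrdp"] = {
        "parent": nrdp_url,    # NRDP URL of the Nagios XI server
        "token": token,        # token accepted by XI for inbound NRDP
        "hostname": hostname,  # name this host is known by in XI
    }
    buf = io.StringIO()
    cfg.write(buf)
    return buf.getvalue()


if __name__ == "__main__":
    # Placeholder values, not real endpoints or credentials.
    print(render_nrdp_settings("https://xi.example.com/nrdp/", "CHANGE_ME", "customer-a-dc01"))
```

Only the hostname changes per machine; the URL and token stay the same for every agent that reports to the same XI server.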
-
- Posts: 32
- Joined: Tue Aug 15, 2017 1:20 am
Re: Distributed Monitoring with Windows Environments
Yeah, I'm currently trying passive checks out. They might be an option. They bring some inconveniences though. Not being able to change warning and critical values from Nagios is a bummer. Also, we currently use one service for all Windows CPU checks, with all relevant hosts added to it. Going passive will inflate the number of services massively.
Nonetheless it might be an option. One thing I can't figure out:
I'm configuring the checks. All is great, but I can't get spaces to work. The LAN interface, for example, is called "Lan-Verbindung 3". I have not found a way to get NCPA to process the space. The same problem occurs with many performance counters. Is there a substitution for a space in the Windows nrdp.cfg?
Re: Distributed Monitoring with Windows Environments
Yes, passive checks are predefined with the warning and critical levels.
Quote: I'm configuring the checks. All is great but I can't get spaces to work. The LAN interface for example is called "Lan-Verbindung 3". I have not found a way to get NCPA to process the space. The same problem occurs with many performance counters. Is there a substitution in the Windows nrdp.cfg for a space?
Could you provide me with the command you are using to check this? Also the list of performance counters that have the same issue with spacing?
Any examples of what NCPA is actually processing would help also.
Thanks!
-
- Posts: 32
- Joined: Tue Aug 15, 2017 1:20 am
Re: Distributed Monitoring with Windows Environments
kyang wrote: Yes, passive checks are predefined with the warning and critical levels. Could you provide me with the command you are using to check this? Also the list of performance counters that have the same issue with spacing? Any examples of what NCPA is actually processing would help also. Thanks!
Sure. The network check is defined in nrdp.cfg as follows:
Code: Select all
%HOSTNAME%|Network Usage = interface/LAN-Verbindung 3/bytes_recv
NCPA then logs this warning:
Code: Select all
WARNING:ncpacheck:Unable to parse all arguments from instruction. Mis-paired option: 3/bytes_recv
An example of a performance counter with the same problem:
Code: Select all
MSExchangeTransport SMTPReceive(_total)\Messages Received/sec
Re: Distributed Monitoring with Windows Environments
This is in your NRDP.cfg? Or do you mean your NCPA.cfg?
Are you grabbing this network check from the NCPA GUI?
This is what my passive check refers to.
Code: Select all
%HOSTNAME%|<service name> = /interface/Local Area Connection/bytes_recv --warning 1000 --critical 2000
Well, actually what version of NCPA are you on?
-
- Posts: 32
- Joined: Tue Aug 15, 2017 1:20 am
Re: Distributed Monitoring with Windows Environments
kyang wrote:
This is in your NRDP.cfg? Or do you mean your NCPA.cfg?
Are you grabbing this network check from the NCPA GUI?
This is what my passive check refers to.
Code: Select all
%HOSTNAME%|<service name> = /interface/Local Area Connection/bytes_recv --warning 1000 --critical 2000
Well, actually what version of NCPA are you on?
Of course in my nrdp.cfg. That's where services are defined according to the document here: https://assets.nagios.com/downloads/ncp ... Checks.pdf
No, I'm not getting those from the GUI. The CPU, RAM and HDD checks were created during the NCPA installation. The rest I'm doing by hand.
I saw that I had missed the initial "/". I added it, and that stopped the errors in the passive.log. Nagios now says:
Quote: The node (LAN-Verbindung) requested does not exist. You may be trying to access the 'LAN-Verbindung 3' node.
NCPA version is 2.1.1.
Edit: Thinking about it, the difference seems to be that my name contains a number and yours doesn't. Maybe NCPA treats numbers after spaces as values for the argument before them (like --warning 1000)?
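One way to reason about that hypothesis offline is to look at how a plain shell-style tokenizer treats the check definition. The sketch below uses Python's standard `shlex.split`; it is not NCPA's actual parser, and `split_check` is a hypothetical helper. It shows that whitespace splitting cuts the path at the first space whether or not a digit follows, which lines up with both the "Mis-paired option: 3/bytes_recv" warning and the "node (LAN-Verbindung) requested does not exist" message. Whether NCPA 2.1.1 accepts a quoted path in nrdp.cfg is an assumption that would have to be tested on the agent.

```python
import shlex


def split_check(instruction):
    """Split a check definition into (API path, remaining tokens).

    Uses shell-style whitespace splitting; this only imitates what a
    command-line style parser might do, it is not NCPA's own code.
    """
    tokens = shlex.split(instruction)
    return tokens[0], tokens[1:]


# Unquoted path with a space followed by a digit.
print(split_check("/interface/LAN-Verbindung 3/bytes_recv --warning 1000 --critical 2000"))

# Unquoted path with a space but no digit: it is cut at the space as well.
print(split_check("/interface/Local Area Connection/bytes_recv --warning 1000 --critical 2000"))

# Quoted path: the tokenizer keeps the whole path together.
print(split_check('"/interface/LAN-Verbindung 3/bytes_recv" --warning 1000 --critical 2000'))
```

In the first two calls the path ends at the first space and the rest of the interface name shows up as a stray token ahead of `--warning`; in the third call the quotes keep the path in one piece.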
Re: Distributed Monitoring with Windows Environments
Ah okay, I normally just define checks in my ncpa.cfg since the NRDP connection settings are in there; either way works.
Quote: The node (LAN-Verbindung) requested does not exist. You may be trying to access the 'LAN-Verbindung 3' node.
You should try using the GUI. It can be a big help when working out the correct definitions for passive and/or active checks.
Quote: Thinking of it, the difference seems to be, that my Name contains a number and yours doesn't. Maybe NCPA processes Numbers after spaces as Values to the Argument before it (like --warning 1000)?
I'll have to test this out and see for myself.
Let me know what you find out. The GUI could be of help, as it essentially finds all nodes that are available to be checked.
-
- Posts: 32
- Joined: Tue Aug 15, 2017 1:20 am
Re: Distributed Monitoring with Windows Environments
So, I gave the web GUI a try and quite like it. I created the command through the API tab and it gave me:
Code: Select all
%HOSTNAME%|<service name> = /interface/LAN Verbindung/bytes_recv --warning 20 --critical 30 --units Gi
I pasted that into my nrdp.cfg and am getting the same error:
Code: Select all
The node (LAN) requested does not exist. You may be trying to access the 'LAN Verbindung' node.
I also tried it without the "3" in the name and with the space in a different position, but that doesn't change anything. I have no clue, to be honest.
Re: Distributed Monitoring with Windows Environments
Yes, the GUI is pretty awesome.
Hmm, but that is interesting... When you added the passive check to the nrdp.cfg, did you restart the ncpa_passive service?
Can you PM or post your nrdp.cfg file for me?
Do you currently have NCPA installed on all of those Windows machines you are trying to get passive checks from?