NagiosXI Licensing vs Monitoring Architecture
Posted: Mon Mar 07, 2016 1:35 pm
BASICS:
In our network environment we host installations of our software for about 100 customers. The software installation requires a SQL Server back-end and several HTTP/S front-end services. SQL Server runs on its own hardware and is not an issue w/r/t this problem. The HTTP/S services for a handful of customers are hosted on a single, external-facing server. For example:
https://<Server1 FQDN>/Customer1-Service1
https://<Server1 FQDN>/Customer1-Service2
https://<Server1 FQDN>/Customer1-Service3
https://<Server1 FQDN>/Customer2-Service1
https://<Server1 FQDN>/Customer2-Service2
https://<Server1 FQDN>/Customer2-Service3
MONITORING:
We want to monitor each one of the HTTP/S services for each customer we host to ensure their uptime. We came up with 2 possible solutions:
1. Add the Host Server to Nagios. For each customer, add 3 service monitors for the specific HTTP/S services to be monitored. With 100+ customers this doesn’t seem optimal, and it is also error-prone: each time a new customer is added, we need to correctly set up 3 Service Monitors. Some of this can be alleviated by setting up appropriate Service Templates and Custom Commands, but it still seems like a lot to do, with more ways to fail.
2. A much better way would be to set up a Host Group and assign it to 3 Custom Service Monitors which use parameterized Commands. The Commands reference MACRO variables to monitor an HTTP/S service for a specific customer (i.e. the value of the MACRO ‘_CUSTSERVICE1’ gives the unique location of service one for a customer). For each customer on ‘Server1’, we then add a new Host Definition with a unique customer-focused name (e.g. Server1: CUST1) and the same IP address as ‘Server1’. When adding the host for the customer, we define the values of the 3 MACROs which identify the 3 HTTP/S services to be monitored, and add the host to the Host Group we created that is assigned to the 3 Service Monitors. The great thing about this approach is that when a new customer comes on board, we simply add a new host, making sure to define the 3 MACROs and assign it to our Host Group, and we’re done. One task vs three. The other HUGE benefit is having a single Service Monitor for each of the 3 HTTP/S services which covers ALL customers. This allows us to tweak the monitoring specifics for ALL customers in a single place. Solution (1.) above would require visiting each service for each customer (a LOT of work; although I would write an Emacs macro to make fairly short work of it, I cannot expect that from our other employees).
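To make (2.) concrete, here is a rough sketch in Python that prints the kind of object definitions I mean, in Nagios Core config syntax. Only ‘_CUSTSERVICE1’..‘_CUSTSERVICE3’ and the Server1/CUST1 naming come from the description above; the host group name, the command name, the ‘generic-host’/‘generic-service’ templates, the example address and the check_http options are assumptions for illustration, not what we actually run. A custom host variable ‘_CUSTSERVICE1’ is referenced from a command or service as ‘$_HOSTCUSTSERVICE1$’.

```python
# Sketch only: renders Nagios object definitions for solution (2.).
# HOSTGROUP, COMMAND and the templates used below are hypothetical names.
HOSTGROUP = "customer-web-services"
COMMAND = "check_customer_https"
SERVICE_COUNT = 3


def render_command():
    """One parameterized command; $ARG1$ carries the per-customer path."""
    return "\n".join([
        "define command {",
        f"    command_name  {COMMAND}",
        # -I/-S/-u usage is an assumption about how the check would be run.
        "    command_line  $USER1$/check_http -I $HOSTADDRESS$ -S -u $ARG1$",
        "}",
    ])


def render_shared_service(n):
    """One service per HTTP/S service slot, shared by every host in the group."""
    return "\n".join([
        "define service {",
        "    use                  generic-service",
        f"    hostgroup_name       {HOSTGROUP}",
        f"    service_description  Customer Service {n}",
        f"    check_command        {COMMAND}!$_HOSTCUSTSERVICE{n}$",
        "}",
    ])


def render_customer_host(server, address, customer, paths):
    """One pseudo-host per customer, all sharing the real server's address."""
    if len(paths) != SERVICE_COUNT:
        raise ValueError(f"expected {SERVICE_COUNT} service paths, got {len(paths)}")
    lines = [
        "define host {",
        "    use         generic-host",
        f"    host_name   {server}-{customer}",
        f"    address     {address}",
        f"    hostgroups  {HOSTGROUP}",
    ]
    for n, path in enumerate(paths, start=1):
        # Custom variables must start with an underscore.
        lines.append(f"    _CUSTSERVICE{n}  {path}")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_command())
    for n in range(1, SERVICE_COUNT + 1):
        print(render_shared_service(n))
    # Onboarding a new customer is then this one call (one host definition).
    print(render_customer_host(
        "Server1", "192.0.2.10", "CUST1",
        ["/Customer1-Service1", "/Customer1-Service2", "/Customer1-Service3"],
    ))
```

The point of the sketch is the shape: three service definitions and one command in total, plus one host definition per customer.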
Originally, I implemented (1.) above. I sat with it, and as we tweaked things and ran into the pain of adjusting so many different services, the case for (2.) became clear. So we moved everything to solution (2.). It works great, except for one problem: we are a fairly small company (about 60 unique hardware devices/IPs to monitor, with various services). As such, we purchased the “100 Node” license. Apparently, a ‘Node’ is really defined as a ‘Host.’ This means that by implementing the more desirable solution to our monitoring problem, (2.), we quickly surpass the number of “Nodes” we are licensed for. This puts us in a difficult situation where the correct solution is being inhibited by the NagiosXI licensing strategy. No judgment on the NagiosXI licensing strategy, but in this case it’s not really working for us.
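For a sense of scale, here is the back-of-the-envelope arithmetic as a small Python sketch. It uses the round numbers from this post (about 60 devices, about 100 customers, 3 services each, a 100-node license) and assumes that every Host Definition counts as one licensed node and that the web servers are already among the 60; the function name is just for illustration.

```python
LICENSED_NODES = 100       # the "100 Node" license
PHYSICAL_HOSTS = 60        # approximate real devices/IPs
CUSTOMERS = 100            # approximate customer count
SERVICES_PER_CUSTOMER = 3


def footprint(per_customer_hosts):
    """Return (host definitions, service definitions) for a given layout."""
    if per_customer_hosts:
        # Solution (2.): one pseudo-host per customer, 3 shared services.
        hosts = PHYSICAL_HOSTS + CUSTOMERS
        services = SERVICES_PER_CUSTOMER
    else:
        # Solution (1.): no extra hosts, 3 services defined per customer.
        hosts = PHYSICAL_HOSTS
        services = CUSTOMERS * SERVICES_PER_CUSTOMER
    return hosts, services


if __name__ == "__main__":
    for label, flag in (("(1.)", False), ("(2.)", True)):
        hosts, services = footprint(flag)
        over = max(0, hosts - LICENSED_NODES)
        print(f"Solution {label}: {hosts} hosts, {services} customer service "
              f"definitions, {over} over the license")
```

Under those assumptions, (1.) stays at 60 hosts but needs 300 customer service definitions, while (2.) needs only 3 but lands at 160 hosts, 60 over the license.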
Has anyone had to solve a similar problem w/r/t the “Node” licenses? If so, how did you work around it while keeping a sane monitoring strategy? Suggestions welcome.
Thanks in advance.