RRD_STORAGE_TYPE = MULTIPLE, no graphs anymore

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
transcom
Posts: 11
Joined: Thu Jun 21, 2018 9:31 am

Post by transcom »

Hi All!

I have a custom command that returns a number of performance counters. The number of counters is dynamic, so I set RRD_STORAGE_TYPE to MULTIPLE for the command. However, there is no longer any graph for the service:
No performance graphs were found for this service.
The RRD files are now generated per counter. According to the Advanced page, the service returns valid performance data, and perfmon.log shows that the perf data is being processed:
Found Performance Data for HOST
I suspect the issue is with the default graph template, which may not be capable of interpreting multiple datasources. How do I troubleshoot this?
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Post by scottwilkerson »

transcom wrote:I suspect that the issue is with the default Graph Template, that is not capable of interpreting multiple DataSources.
They are capable of handling multiple datasources, but the number of datasources can never change once the RRD is created.

You need to send the same number of datasources as when the RRD was created, or it will not update.

RRD files have a fixed size from creation and cannot be changed to hold a dynamic number of datasources.
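
A small sketch of the constraint described above, assuming the standard Nagios plugin perfdata format ('label'=value[UOM];warn;crit;min;max). It compares the datasource labels a plugin returns now with the ones it returned when the RRD was created. The helper names and the sample perfdata strings are made up for illustration; they are not part of Nagios XI or PNP.

```python
import re

# One perfdata item: a bare or single-quoted label, '=', then the value token.
_PERF_ITEM = re.compile(r"(?:'([^']+)'|([^\s=']+))=(\S*)")


def perf_labels(perfdata):
    """Return the datasource labels in the order the plugin emitted them."""
    return [quoted or bare for quoted, bare, _value in _PERF_ITEM.findall(perfdata)]


def same_datasources(created_with, current):
    """True only if the label list is unchanged since the RRD was created."""
    return perf_labels(created_with) == perf_labels(current)


# Hypothetical plugin output at RRD creation time and after a plugin change.
created = "rta=0.42ms;100;500;0 pl=0%;20;60;0"
current = "rta=0.40ms;100;500;0 pl=0%;20;60;0 rtmax=1.1ms rtmin=0.2ms"

print(perf_labels(current))
print(same_datasources(created, current))
```

If the second call reports a difference, the existing RRD no longer matches what the plugin sends.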
Former Nagios employee
Creator:
ahumandesign.com
enneagrams.com

Post by transcom »

scottwilkerson wrote: They are capable of handling multiple datasources, but the number of datasources can never change once the RRD is created.

You need to send the same number of datasources as when the RRD was created, or it will not update.

RRD files have a fixed size from creation and cannot be changed to hold a dynamic number of datasources.
With RRD_STORAGE_TYPE set to MULTIPLE, I have one RRD file per counter, NOT one RRD file per service. There are multiple RRD files named 'service_counter1.rrd', 'service_counter2.rrd', etc. Each RRD file contains data from a single datasource. The graph template should be able to walk all RRD files associated with the service (they are listed in the XML file) and draw the graphs.
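
One way to check this by hand is to list which RRD file the XML records for each datasource and test whether that file exists. This is only a sketch: the element names (DATASOURCE, NAME, RRDFILE) are an assumption about the XML layout, and the sample document and paths are invented, so compare against a real XML file first.

```python
import os
import xml.etree.ElementTree as ET

# Invented sample; the element names are an assumption about the XML layout.
SAMPLE_XML = """
<NAGIOS>
  <DATASOURCE>
    <NAME>counter1</NAME>
    <RRDFILE>/nonexistent/perfdata/host/service_counter1.rrd</RRDFILE>
  </DATASOURCE>
  <DATASOURCE>
    <NAME>counter2</NAME>
    <RRDFILE>/nonexistent/perfdata/host/service_counter2.rrd</RRDFILE>
  </DATASOURCE>
</NAGIOS>
"""


def rrd_files_by_datasource(xml_text):
    """Map each datasource name to the RRD file path recorded for it."""
    root = ET.fromstring(xml_text.strip())
    return {
        ds.findtext("NAME", default="?"): ds.findtext("RRDFILE", default="")
        for ds in root.iter("DATASOURCE")
    }


def missing_rrd_files(mapping):
    """Return the recorded paths that do not exist on disk."""
    return [path for path in mapping.values() if not os.path.isfile(path)]


files = rrd_files_by_datasource(SAMPLE_XML)
for name, path in files.items():
    print(name, "->", path)
print("missing:", missing_rrd_files(files))
```

With a real XML file, an empty "missing" list would suggest the files are in place and the problem lies in how the graphs are looked up.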

Post by scottwilkerson »

transcom wrote:With RRD_STORAGE_TYPE set to MULTIPLE, I have one RRD file per counter, NOT one RRD file per service.
I understand this, but the datasources CANNOT change after the RRD is created.

The only way to do it is through a long, drawn-out process, not on the fly:
https://stackoverflow.com/questions/134 ... isting-rrd

I know this very well, as it was a major issue when a standard ping plugin was updated to return 5 datasources instead of 2. That is what prompted us to write these instructions:
https://support.nagios.com/kb/article.php?id=149

The script linked above would actually fix your problem, but only until your plugin returns more datasources than it does now.

Post by transcom »

The original RRD and XML files were deleted while the service was temporarily disabled. The current files are all newly generated.

Post by scottwilkerson »

But is the plugin returning a different number of datasources than it did when the RRD was first created?

Post by transcom »

Yes, it does. Is this saved somewhere else, other than in the original RRD file?

The goal is to make the performance counters (and, obviously, the graphs) dynamic: new counters should be added automatically when perf data with a new counter name is received.

Post by scottwilkerson »

transcom wrote:Yes, it does. Is this saved somewhere else, other than in the original RRD file?

The goal is to make the performance counters (and, obviously, the graphs) dynamic: new counters should be added automatically when perf data with a new counter name is received.
No, just the RRD.

Post by transcom »

OK, just to sum up all details I can think of (previous messages were sent from mobile, so they were cut short)!

- A new Nagios XI system was installed and some custom services were developed. Some of them have performance counters whose number can grow or shrink, because the service output is generated dynamically from external parameters.

- As I learned later, the RRD file for the service was created in RRD_STORAGE_TYPE = SINGLE mode, as this is the system default for Nagios. I also learned that (as Scott wrote above) the datasources of an RRD file are fixed when the file is created and cannot be extended afterwards.

- The service and its performance graph worked correctly until the number of perf counters first changed (extended from 4 to 6 counters). Service output and alerting are still OK, but the graph started to run amok, swapping counter names when displaying historical data from previous days. As I discovered, this was because the datasources in the RRD file were not named but numbered (1, 2, 3, 4), and the 2 new counters were returned first and second.

- After some reading and investigation, I realized that setting RRD_STORAGE_TYPE = MULTIPLE for the corresponding service is a possible solution. So I created the custom "command name.cfg" file under the /usr/local/nagios/etc/pnp/check_commands folder for the command used by my service, then disabled the service temporarily and removed the "service_name.RRD" and "service_name.XML" files from the perfdata/"hostname" folder. As this is a new Nagios installation, losing the few days of historical data in the RRD file was not a big deal.

- After I enabled the service again, a single "service_name.XML" file and 6 "service_name-counter_name.RRD" files (one for every counter in the service) were created. The XML contained references to all 6 RRD files, and I thought "so far, so good". Then I realized that there is no performance graph for the service anymore. Instead, there is only the message "No performance graphs were found for this service".

- I checked on the Advanced tab of the service that performance data is returned by the service and recognized by Nagios.

- I enabled performance data logging and confirmed that performance data for the service in question (and, of course, for the other services too) is being processed without error messages (the log shows "Found Performance Data for HOST - Service" with the actual counter values).

- After more reading and research, I concluded that the most probable cause of the graph issue is that the graph template used for the service (the default Nagios template) is not prepared for multiple RRD files. It cannot parse all the RRD files of the service, therefore concludes that there is no RRD file at all, and gives the "No performance graphs were found for this service" error.

- I tried to find an updated graph template that supports multiple RRD files and RRD_STORAGE_TYPE = MULTIPLE. I managed to find some, but each of them was created and tailored for a specific command, and as I am not an experienced PHP coder, I could not make use of them.

This is where I stand now, and I hope that someone can help or come up with a solution (for example, a default graph template updated to support multiple RRD files/RRD_STORAGE_TYPE = MULTIPLE, or another fix if the graph template is not what is causing the issue). Dynamic services and dynamic performance counters are a very good thing: they reduce the need for manual Nagios administration and allow more flexible monitoring of the infrastructure.
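
To illustrate the name swapping described in the list above (datasources stored by number, new counters returned first), here is a toy model. It is not PNP's actual code, the counter names are invented, and what happens to values beyond the last slot is not modelled.

```python
def mislabelled_slots(labels_at_creation, labels_now):
    """Return (slot, stored_label, incoming_label) for every numbered slot
    whose incoming value belongs to a different counter than its history."""
    result = []
    pairs = zip(labels_at_creation, labels_now)  # stops at the last existing slot
    for slot, (stored, incoming) in enumerate(pairs, start=1):
        if stored != incoming:
            result.append((slot, stored, incoming))
    return result


# Hypothetical counters: 4 at creation time, then 2 new ones returned first.
at_creation = ["users", "sessions", "errors", "latency"]
now = ["queue", "retries"] + at_creation

for slot, stored, incoming in mislabelled_slots(at_creation, now):
    print(f"slot {slot}: history is '{stored}', new value is '{incoming}'")
```

In this model every one of the four slots receives a value from a different counter than the one its history belongs to, which matches the symptom I saw in the graph.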

Post by scottwilkerson »

Well, this is a lot more information, and it helps me understand what was going on.
transcom wrote:- After some reading and investigation, I realized that setting RRD_STORAGE_TYPE = MULTIPLE for the corresponding service is a possible solution. So I created the custom "command name.cfg" file under the /usr/local/nagios/etc/pnp/check_commands folder for the command used by my service, then disabled the service temporarily and removed the "service_name.RRD" and "service_name.XML" files from the perfdata/"hostname" folder. As this is a new Nagios installation, losing the few days of historical data in the RRD file was not a big deal.

- After I enabled the service again, a single "service_name.XML" file and 6 "service_name-counter_name.RRD" files (one for every counter in the service) were created. The XML contained references to all 6 RRD files, and I thought "so far, so good". Then I realized that there is no performance graph for the service anymore. Instead, there is only the message "No performance graphs were found for this service".
In Nagios XI we have never used this mode, where there are multiple RRD files per service, and as such Nagios XI EXPECTS there to be just 1 RRD file per service.
transcom wrote:This is where I stand now, and I hope that someone can help or come up with a solution (for example, a default graph template updated to support multiple RRD files/RRD_STORAGE_TYPE = MULTIPLE, or another fix if the graph template is not what is causing the issue). Dynamic services and dynamic performance counters are a very good thing: they reduce the need for manual Nagios administration and allow more flexible monitoring of the infrastructure.
While I agree, it simply is not available currently.

I would put in a feature request for you; however, before the system could be rewritten to accommodate this, we will already be moving in a different direction to accommodate a dynamic number of datasources, moving away from RRD files altogether.