Nagios Support Forum

Posted: **Thu Jan 14, 2016 7:32 pm**

Sorry, the host is the same; I just changed the hostname to something different in the post...
I corrected it in the previous post.
We can forget the new service I created to simplify things, but it has the same issue...

BTW, after looking some more, it looks like the trouble isn't related to the copied service, but to the check in general. The RRD/XML files were pretty old, as are a lot of them. It's slowly dawning on me that the little graph icon thing that tells me I have graph info is not telling me I have good data... It seems that more than half of the rrd files are out of date just on this service.

# locate FS_Win_Usage.rrd | xargs ls -l | grep -v Jan | wc -l
22
# locate FS_Win_Usage.rrd | xargs ls -l | wc -l
46
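
The same count can be taken by modification time instead of grepping month names out of `ls -l`, which keeps working across year boundaries. This is only a minimal sketch: the function name `stale_rrds`, the directory argument, and the 7-day default are illustrative assumptions, not anything taken from the setup described here.

```shell
# List RRD files with the given name that have not been modified
# for more than N days (default 7).
# Usage: stale_rrds DIR NAME [DAYS]
stale_rrds() {
    dir=$1
    name=$2
    days=${3:-7}
    # -mtime +N matches files whose last modification is more than N days old
    find "$dir" -type f -name "$name" -mtime +"$days"
}
```

For example, `stale_rrds /usr/local/nagios/share/perfdata FS_Win_Usage.rrd | wc -l` would count the stale files, assuming the perfdata directory is in that location on the server in question.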

Still, even though the thread title is wrong, the underlying issue remains. No perfdata from this service. The outdated rrd/xml files have been removed for the server I'm using as an example.

Hopefully some of this makes sense.

Posted: **Thu Jan 14, 2016 9:54 pm**

I'm pretty sure it's NSClient++ that is the issue here.

You don't have MinCritFree= defined and that is causing the performance data to not display (I'm with you, sometimes you only want warning and not critical with certain services).

Here is your command:

./check_nrpe -H win2008r2-01 -u -t 30 -c CheckDriveSize -a ShowAll=long MinWarnFree=10%

WARNING warning(D:\: Total: 0B - Used: 0B (100%) - Free: 0B (0%)), : Total: 99.996MB - Used: 29.199MB (30%) - Free: 70.797MB (70%), E:\: Total: 39.997GB - Used: 91.004MB (1%) - Free: 39.908GB (99%), Q:\: Total: 24.997GB - Used: 90.531MB (1%) - Free: 24.909GB (99%), C:\Mounted Disks\A Mounted Disk\: Total: 17.997GB - Used: 90.313MB (1%) - Free: 17.909GB (99%), C:\: Total: 59.9GB - Used: 19.544GB (33%) - Free: 40.356GB (67%)

Here it is with MinCritFree= added:

./check_nrpe -H win2008r2-01 -u -t 30 -c CheckDriveSize -a ShowAll=long MinWarnFree=10%  MinCritFree=5%

CRITICAL critical(D:\: Total: 0B - Used: 0B (100%) - Free: 0B (0%)), : Total: 99.996MB - Used: 29.199MB (30%) - Free: 70.797MB (70%), E:\: Total: 39.997GB - Used: 91.004MB (1%) - Free: 39.908GB (99%), Q:\: Total: 24.997GB - Used: 90.531MB (1%) - Free: 24.909GB (99%), C:\Mounted Disks\A Mounted Disk\: Total: 17.997GB - Used: 90.313MB (1%) - Free: 17.909GB (99%), C:\: Total: 59.9GB - Used: 19.544GB (33%) - Free: 40.356GB (67%)|'\\?\Volume{791a0772-737e-11e4-98ac-806e6f6e6963}\ free'=70.79687MB;9.9996;4.9998;0;99.99609 '\\?\Volume{791a0772-737e-11e4-98ac-806e6f6e6963}\ free %'=70%;9;4;0;100 'E:\ free'=39.90819GB;3.9997;1.99985;0;39.99706 'E:\ free %'=99%;9;4;0;100 'Q:\ free'=24.90865GB;2.4997;1.24985;0;24.99706 'Q:\ free %'=99%;9;4;0;100 'C:\Mounted Disks\A Mounted Disk\ free'=17.90887GB;1.7997;0.89985;0;17.99706 'C:\Mounted Disks\A Mounted Disk\ free %'=99%;9;4;0;100 'C:\ free'=40.35615GB;5.99003;2.99501;0;59.90038 'C:\ free %'=67%;9;4;0;100 'D:\ free'=0B;0;0;0;0

This appears to be a bug in NSClient++.

I also think it's related to the system volumes somehow:
\\?\Volume{791a0772-737e-11e4-98ac-806e6f6e6963}\

Are you able to log the bug here:
https://github.com/mickem/nscp
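
To confirm from the command line whether a check returns perfdata at all, one option is to look for the `|` separator that the plugin output format places between the status text and the performance data. A minimal sketch, handling only the first output line; the function names `perfdata_of` and `has_perfdata` are made up for illustration:

```shell
# Print the perfdata portion of a plugin output line:
# everything after the first '|'. Prints nothing if there is no '|'.
perfdata_of() {
    case $1 in
        *'|'*) printf '%s\n' "${1#*|}" ;;
    esac
}

# Succeed (exit 0) only if the output line carries non-empty perfdata.
has_perfdata() {
    [ -n "$(perfdata_of "$1")" ]
}
```

Usage would be along the lines of `out=$(./check_nrpe -H win2008r2-01 -u -t 30 -c CheckDriveSize -a ShowAll=long MinWarnFree=10% MinCritFree=5%); has_perfdata "$out" && echo "perfdata present"`.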

Posted: **Fri Jan 15, 2016 12:13 pm**

The MinCritFree is there, but I missed copying the wrapped part when reposting after the tr confusion. The system volumes exist on all servers so that's not the issue either.

# /usr/local/nagios/libexec/check_nrpe -H sqlhost -u -t 30 -c CheckDriveSize -a ShowAll=long MinWarnFree=10% MinCritFree=5% FilterType=fixed | tr , "\n"
OK C:\: Total: 299.901GB - Used: 158.707GB (53%) - Free: 141.195GB (47%)
D:\: Total: 99.996GB - Used: 51.638GB (52%) - Free: 48.358GB (48%)
G:\: Total: 2.441TB - Used: 233.248GB (10%) - Free: 2.214TB (90%)
J:\: Total: 10TB - Used: 3.982TB (40%) - Free: 6.018TB (60%)
L:\: Total: 4TB - Used: 328.096GB (9%) - Free: 3.679TB (91%)
M:\: Total: 4TB - Used: 397.326GB (10%) - Free: 3.612TB (90%)
N:\: Total: 13TB - Used: 6.564TB (51%) - Free: 6.436TB (49%)
O:\: Total: 12TB - Used: 4.226TB (36%) - Free: 7.774TB (64%)
R:\: Total: 1.75TB - Used: 508.166GB (29%) - Free: 1.254TB (71%)
T:\: Total: 1.065TB - Used: 481.642GB (45%) - Free: 608.595GB (55%)
W:\: Total: 1TB - Used: 782.761GB (77%) - Free: 241.112GB (23%)
X:\: Total: 10TB - Used: 2.027TB (21%) - Free: 7.973TB (79%)
: Total: 99.996MB - Used: 32.141MB (33%) - Free: 67.855MB (67%)

Posted: **Fri Jan 15, 2016 2:32 pm**

Could you post how the service check is defined on your XI server so we can review it?

Posted: **Fri Jan 15, 2016 2:50 pm**

The config file is in post 1. What else should I add?

Posted: **Fri Jan 15, 2016 3:24 pm**

Sorry, I missed that.
It could be that too much performance data is being returned. Could you run the following and post the output here?

echo 'describe nagios_servicestatus;' | mysql  -t -pnagiosxi nagios

Posted: **Fri Jan 15, 2016 3:31 pm**

Sure...
I also looked at using check_nt, but it seems to want a single drive letter, which would mean I'd need up to 12 services to check all drives. :(
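
To illustrate the one-service-per-drive problem, here is a rough sketch that prints one `check_nt` command line per drive letter. The function name `check_nt_cmds` is hypothetical, and the `-w 90 -c 95` thresholds are an assumed translation of MinWarnFree=10% and MinCritFree=5%, since USEDDISKSPACE thresholds are expressed as percent used:

```shell
# Print one check_nt command line per drive letter.
# Usage: check_nt_cmds HOST DRIVE...
check_nt_cmds() {
    host=$1
    shift
    for drive in "$@"; do
        # USEDDISKSPACE takes a single drive letter via -l;
        # 10%/5% free corresponds to 90/95 percent used.
        printf 'check_nt -H %s -v USEDDISKSPACE -l %s -w 90 -c 95\n' "$host" "$drive"
    done
}
```

Calling `check_nt_cmds sqlhost C D G J L M N O R T W X` prints twelve lines, one per drive, which is the service count mentioned above.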

# echo 'describe nagios_servicestatus;' | mysql  -t -pnagiosxi nagios
+-------------------------------+---------------+------+-----+---------------------+----------------+
| Field                         | Type          | Null | Key | Default             | Extra          |
+-------------------------------+---------------+------+-----+---------------------+----------------+
| servicestatus_id              | int(11)       | NO   | PRI | NULL                | auto_increment |
| instance_id                   | smallint(6)   | NO   | MUL | 0                   |                |
| service_object_id             | int(11)       | NO   | UNI | 0                   |                |
| status_update_time            | datetime      | NO   | MUL | 0000-00-00 00:00:00 |                |
| output                        | varchar(1024) | NO   |     | NULL                |                |
| long_output                   | varchar(1024) | NO   |     | NULL                |                |
| perfdata                      | varchar(1024) | NO   |     | NULL                |                |
| current_state                 | smallint(6)   | NO   | MUL | 0                   |                |
| has_been_checked              | smallint(6)   | NO   |     | 0                   |                |
| should_be_scheduled           | smallint(6)   | NO   |     | 0                   |                |
| current_check_attempt         | smallint(6)   | NO   |     | 0                   |                |
| max_check_attempts            | smallint(6)   | NO   |     | 0                   |                |
| last_check                    | datetime      | NO   |     | 0000-00-00 00:00:00 |                |
| next_check                    | datetime      | NO   |     | 0000-00-00 00:00:00 |                |
| check_type                    | smallint(6)   | NO   | MUL | 0                   |                |
| last_state_change             | datetime      | NO   | MUL | 0000-00-00 00:00:00 |                |
| last_hard_state_change        | datetime      | NO   |     | 0000-00-00 00:00:00 |                |
| last_hard_state               | smallint(6)   | NO   |     | 0                   |                |
| last_time_ok                  | datetime      | NO   |     | 0000-00-00 00:00:00 |                |
| last_time_warning             | datetime      | NO   |     | 0000-00-00 00:00:00 |                |
| last_time_unknown             | datetime      | NO   |     | 0000-00-00 00:00:00 |                |
| last_time_critical            | datetime      | NO   |     | 0000-00-00 00:00:00 |                |
| state_type                    | smallint(6)   | NO   | MUL | 0                   |                |
| last_notification             | datetime      | NO   |     | 0000-00-00 00:00:00 |                |
| next_notification             | datetime      | NO   |     | 0000-00-00 00:00:00 |                |
| no_more_notifications         | smallint(6)   | NO   |     | 0                   |                |
| notifications_enabled         | smallint(6)   | NO   | MUL | 0                   |                |
| problem_has_been_acknowledged | smallint(6)   | NO   | MUL | 0                   |                |
| acknowledgement_type          | smallint(6)   | NO   |     | 0                   |                |
| current_notification_number   | smallint(6)   | NO   |     | 0                   |                |
| passive_checks_enabled        | smallint(6)   | NO   | MUL | 0                   |                |
| active_checks_enabled         | smallint(6)   | NO   | MUL | 0                   |                |
| event_handler_enabled         | smallint(6)   | NO   | MUL | 0                   |                |
| flap_detection_enabled        | smallint(6)   | NO   | MUL | 0                   |                |
| is_flapping                   | smallint(6)   | NO   | MUL | 0                   |                |
| percent_state_change          | double        | NO   | MUL | 0                   |                |
| latency                       | double        | NO   | MUL | 0                   |                |
| execution_time                | double        | NO   | MUL | 0                   |                |
| scheduled_downtime_depth      | smallint(6)   | NO   | MUL | 0                   |                |
| failure_prediction_enabled    | smallint(6)   | NO   |     | 0                   |                |
| process_performance_data      | smallint(6)   | NO   |     | 0                   |                |
| obsess_over_service           | smallint(6)   | NO   |     | 0                   |                |
| modified_service_attributes   | int(11)       | NO   |     | 0                   |                |
| event_handler                 | varchar(255)  | NO   |     |                     |                |
| check_command                 | varchar(255)  | NO   |     |                     |                |
| normal_check_interval         | double        | NO   |     | 0                   |                |
| retry_check_interval          | double        | NO   |     | 0                   |                |
| check_timeperiod_object_id    | int(11)       | NO   |     | 0                   |                |
+-------------------------------+---------------+------+-----+---------------------+----------------+

Posted: **Mon Jan 18, 2016 11:15 am**

Let's try increasing the MySQL field sizes and see if that helps. Run the following as root on the XI server.

echo "use nagios;alter table nagios_servicestatus modify output varchar(2048) not null;alter table nagios_servicestatus modify long_output varchar(2048) not null;alter table nagios_servicestatus modify perfdata varchar(2048) not null;" | mysql -pnagiosxi
echo "use nagios;alter table nagios_servicechecks modify output varchar(2048) not null;alter table nagios_servicechecks modify long_output varchar(2048) not null;alter table nagios_servicechecks modify perfdata varchar(2048) not null;" | mysql -pnagiosxi

After this, in the XI GUI, go to Home > Service Detail, select that service, go to the Advanced tab, and see if the performance data field has information in it.

Posted: **Mon Jan 18, 2016 11:22 am**

The data above is 845 bytes, so the size of the column is unlikely to be the issue.
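
That comparison can be made explicit by counting the bytes in a captured string and checking them against the column size. A minimal sketch; the helper names `byte_len` and `fits_column` are hypothetical, and it assumes plain ASCII output so that bytes and varchar characters line up:

```shell
# Print the byte length of a string.
byte_len() {
    printf '%s' "$1" | wc -c | tr -d ' '
}

# Succeed if the string fits in a column of the given size.
# Usage: fits_column STRING [LIMIT]   (LIMIT defaults to 1024)
fits_column() {
    [ "$(byte_len "$1")" -le "${2:-1024}" ]
}
```

For instance, `fits_column "$perfdata" 1024 && echo "fits"` where `$perfdata` holds whatever string the check returned.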

Posted: **Mon Jan 18, 2016 3:21 pm**

It has to be a bug in NSClient++ then. If you can't get performance data to work on the command line, then the NSClient is the issue.

No perfdata on a copied service
