API failing with http code 500 on Servicestatus request
Posted: Wed Apr 05, 2017 4:03 pm
by grenley
Hi.
All of a sudden, on two of our production XI servers, the objects/servicestatus api is failing.
When called through the XI (Help) gui, nothing is returned.
When I call it through a script, I'm getting "500 Internal Server Error"
objects/hoststatus is working fine so it's not an api key issue or anything like that.
table corruption, perhaps?
What kind of info can I give you to help troubleshoot?
Thanks,
Rick
Re: API failing with http code 500 on Servicestatus request
Posted: Wed Apr 05, 2017 4:50 pm
by avandemore
The apache error log would be the first place to start. Usually that is located at /var/log/httpd/error_log.
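A quick way to pull the relevant entries out of that log is sketched below. The function name `recent_php_fatals` and the default of 20 lines are illustrative choices, and the default path is simply the usual location mentioned above; it may differ on your install, and reading it may require root.

```shell
# Print the most recent PHP fatal errors from an Apache error log.
# $1 = log file (defaults to the usual location), $2 = how many lines to show.
recent_php_fatals() {
    log="${1:-/var/log/httpd/error_log}"
    count="${2:-20}"
    grep 'PHP Fatal error' "$log" | tail -n "$count"
}
```

Calling `recent_php_fatals` with no arguments reads the default log; pass a path as the first argument if your log lives elsewhere.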
Re: API failing with http code 500 on Servicestatus request
Posted: Wed Apr 05, 2017 5:35 pm
by grenley
yep...
[Wed Apr 05 18:33:52.314577 2017] [:error] [pid 82718] [client xxx.xx.xx.xxx:57309] PHP Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 2 bytes) in /opt/app/nagiosxi/html/backend/includes/xml2json.php on line 243
Re: API failing with http code 500 on Servicestatus request
Posted: Thu Apr 06, 2017 11:02 am
by mcapra
Can you try increasing your memory_limit in /etc/php.ini to roughly double what it is now? Be sure to do a service httpd restart afterwards to apply the changes.
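For reference, the 134217728 bytes in the error above is 128M. A small sketch for checking what the ini file currently sets before and after editing it follows; the helper name `show_php_setting` is made up for illustration, and it only reads the one file you point it at, so settings overridden elsewhere would not show up.

```shell
# Show the active (uncommented) line for one directive in a php.ini file.
# $1 = directive name, $2 = ini file (defaults to the path named above).
show_php_setting() {
    name="$1"
    ini="${2:-/etc/php.ini}"
    grep -E "^[[:space:]]*${name}[[:space:]]*=" "$ini"
}
```

For example, `show_php_setting memory_limit` prints the current `memory_limit` line from /etc/php.ini.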
Re: API failing with http code 500 on Servicestatus request
Posted: Thu Apr 06, 2017 12:43 pm
by grenley
Tried double (256M). No good.
Tried 512M. Still failed.
Tried 1024M. That was enough memory, but now the php timed out.
Bumped it from 30 secs to 60.
That worked, but it took nearly the entire minute. Already a challenge for a user interface.
We are planning on tripling the number of servers/service checks on that XI server.
That will undoubtedly mean bumping up these values to unreasonable numbers.
We are really counting on the API.
Any thoughts?
Re: API failing with http code 500 on Servicestatus request
Posted: Thu Apr 06, 2017 2:23 pm
by mcapra
If you have a bunch of services, an un-filtered servicestatus request is going to be very big. One way around this would be to limit the results returned by a single request by using a limited query. See "Building Limited Queries" from the API help section for more info.
For example, if I wanted to get all the service statuses I could do this:
Code: Select all
curl -XGET "http://192.168.67.1/nagiosxi/api/v1/objects/servicestatus?apikey=KR2LLsBuhmmFnS4dbmeURW0culVlv39vbbBVW8pet69bXdH8CUiK8DcFX7gMpohD&pretty=1"
But that's really big and takes a long time for PHP to build. A better approach might be to get the records in chunks by using the records variable in my GET request like so:
Code: Select all
curl -XGET "http://192.168.67.1/nagiosxi/api/v1/objects/servicestatus?apikey=KR2LLsBuhmmFnS4dbmeURW0culVlv39vbbBVW8pet69bXdH8CUiK8DcFX7gMpohD&pretty=1&records=1:10"
Which will only return the first 10 records found. A simple iteration with increments of 10/20/100/etc in your script until no results are found might be a gentler way to get all that information via the API.
Re: API failing with http code 500 on Servicestatus request
Posted: Thu Apr 06, 2017 6:26 pm
by grenley
I like the limited query concept.
So, what I really want is to get all the servicestatus data for a single host.
This totally did the trick and is quite fast:
Code: Select all
&host_name=in:zzzzzzz.att.com&pretty=1
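Spelled out as a complete request, the per-host filter might look like the sketch below. `XI_SERVER`, `XI_APIKEY`, and both function names are placeholders invented for this example; only the URL path and the `host_name=in:` filter come from this thread.

```shell
# Sketch only: XI_SERVER and XI_APIKEY are placeholder values to replace.
XI_SERVER="${XI_SERVER:-xi.example.com}"
XI_APIKEY="${XI_APIKEY:-changeme}"

# Build the servicestatus URL for one host, using the host_name=in: filter.
servicestatus_url() {
    printf 'http://%s/nagiosxi/api/v1/objects/servicestatus?apikey=%s&host_name=in:%s&pretty=1\n' \
        "$XI_SERVER" "$XI_APIKEY" "$1"
}

# Fetch the service status records for one host ($1 = host name).
host_servicestatus() {
    curl -s "$(servicestatus_url "$1")"
}
```

Because each request returns only one host's services, the response stays small no matter how many services the server monitors in total.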
Thanks very much!
Rick
Re: API failing with http code 500 on Servicestatus request
Posted: Fri Apr 07, 2017 9:08 am
by cdienger
Glad that we could help. Did you have any more related questions or is it okay to lock the thread?
Re: API failing with http code 500 on Servicestatus request
Posted: Fri Apr 28, 2017 12:39 pm
by grenley
Hi.
The API seems to have a number of inconsistencies.
I'm trying to delete a host.
First, I get a list of all the services and load them into an array:
Here are the code snippets (from two subroutines):
Code: Select all
use LWP::UserAgent;
use HTTP::Request;

# Fetch the service status records for a single host.
$browser = LWP::UserAgent->new;
$url = "http://$xiServer/nagiosxi/api/v1/objects/servicestatus?apikey=$apiKey&host_name=in:$hostName&pretty=1";
$req = HTTP::Request->new(GET => $url);
$response = $browser->request($req);
if (! $response->is_success) {
    # Report the HTTP status line and give up.
    my $errorMsg = $response->status_line;
    print "$errorMsg\n";
    exit 1;
}
$responseContent = $response->content;
Code: Select all
use JSON;

# Collect the display name of every service in the response.
my $json = JSON->new;
my $perlData = $json->decode($jsonData);
my $allServiceDataHashRef = $perlData->{servicestatuslist};
my $serviceArrayRef = $allServiceDataHashRef->{servicestatus};
for my $serviceHashRef (@{$serviceArrayRef}) {
    my $serviceName = $serviceHashRef->{display_name};
    push @serviceList, $serviceName;
}
Now I run through the array of services names and try to delete them:
Code: Select all
$browser = LWP::UserAgent->new;
$url = "http://$xiServer/nagiosxi/api/v1/config/service?apikey=$apiKey&host_name=$hostName&service_description=$service";
$req = HTTP::Request->new(DELETE => $url);
$response = $browser->request($req);
print $response->content;
As expected, what I get back are a bunch of messages like these:
{"success":"Removed xxxxxx.xxxx.att.com :: Proc_STAR-crond from the system. Config imported but not yet applied."}
{"success":"Removed xxxxxx.xxxx.att.com :: Proc_STAR-vxconfigd from the system. Config imported but not yet applied."}
{"success":"Removed xxxxxx.xxxx.att.com :: Proc_STAR-BESClient from the system. Config imported but not yet applied."}
Finally, I try to do an applyconfig:
Code: Select all
$browser = LWP::UserAgent->new;
$url = "http://$xiServer/nagiosxi/api/v1/system/applyconfig?apikey=$apiKey";
$req = HTTP::Request->new(POST => $url);
$response = $browser->request($req);
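The delete and apply steps above can also be driven from the shell, which makes it easy to percent-encode each service name. This is a hedged sketch: `XI_SERVER`, `XI_APIKEY`, `CURL`, and the function names are placeholders invented here, and it assumes the endpoints accept the same query parameters as the Perl requests above. It only rules out unencoded characters in the URL as a variable.

```shell
# Sketch only: XI_SERVER and XI_APIKEY are placeholders; CURL can be
# pointed at another command to preview the requests without sending them.
XI_SERVER="${XI_SERVER:-xi.example.com}"
XI_APIKEY="${XI_APIKEY:-changeme}"
CURL="${CURL:-curl}"

# Delete one service. $1 = host name, $2 = service description.
# -G moves the --data-urlencode pairs into the query string, so names
# containing spaces or other special characters are percent-encoded.
delete_service() {
    "$CURL" -s -G -X DELETE \
        --data-urlencode "apikey=$XI_APIKEY" \
        --data-urlencode "host_name=$1" \
        --data-urlencode "service_description=$2" \
        "http://$XI_SERVER/nagiosxi/api/v1/config/service"
}

# Apply the configuration once, after all deletions have been submitted.
apply_config() {
    "$CURL" -s -X POST "http://$XI_SERVER/nagiosxi/api/v1/system/applyconfig?apikey=$XI_APIKEY"
}

# Delete every service named on stdin (one per line) for host $1, then apply.
delete_services_and_apply() {
    while IFS= read -r svc; do
        if [ -n "$svc" ]; then
            delete_service "$1" "$svc"
        fi
    done
    apply_config
}
```

Usage would be along the lines of `delete_services_and_apply myhost < services.txt`, where `services.txt` is a hypothetical file holding one service description per line.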
When I look at the Nagios console, all the services are still there.
I try an Apply Configuration through CCM and *sometimes* the services disappear in XI but then slowly start trickling back in.
They are definitely not coming from Unconfigured Objects so I can only assume there is some mismatch between Core and XI.
I run all the above code again and I get a smattering of different responses:
{"error":"Could not find a unique id for this service."}
{"error":"Could not find a unique id for this service."}
{"error":"Could not find a unique id for this service."}
{"error":"Could not find a unique id for this service."}
{"success":"Removed xxxxxx.xxxx.att.com :: Filespace_STAR-var from the system. Config imported but not yet applied."}
{"success":"Removed xxxxxx.xxxx.att.com :: Filespace_STAR-var-adm from the system. Config imported but not yet applied."}
{"error":"Could not find a unique id for this service."}
{"success":"Removed xxxxxx.xxxx :: Filespace_STAR-opt-openv from the system. Config imported but not yet applied."}
{"success":"Removed xxxxxx.xxxx :: Filespace_STAR-var-adm-crash from the system. Config imported but not yet applied."}
Obviously, I can't delete the host, itself, until all the services are gone.
Our enterprise is so huge that we are absolutely counting on the API to function properly.
Should I open a ticket for this?
Thanks,
Rick
Re: API failing with http code 500 on Servicestatus request
Posted: Fri Apr 28, 2017 1:31 pm
by ssax
Please run these commands and post the full output:
Code: Select all
ipcs -q
ps aux
chage -l nagios
chage -l apache
sestatus
When you receive the success messages for deletion and then log in to the CCM, does it still show the host/services, or are they gone?
Also, please PM one of us a copy of your profile. You can download it by going to Admin > System Config > System Profile and clicking the Download Profile button in the top right corner.
Thank you