API failing with http code 500 on Servicestatus request
Hi.
All of a sudden, on two of our production XI servers, the objects/servicestatus API is failing.
When called through the XI (Help) GUI, nothing is returned.
When I call it through a script, I get "500 Internal Server Error".
objects/hoststatus is working fine, so it's not an API key issue or anything like that.
Table corruption, perhaps?
What kind of info can I give you to help troubleshoot?
Thanks,
Rick
-
avandemore
- Posts: 1597
- Joined: Tue Sep 27, 2016 4:57 pm
Re: API failing with http code 500 on Servicestatus request
The Apache error log would be the first place to start. It is usually located at /var/log/httpd/error_log.
Previous Nagios employee
Re: API failing with http code 500 on Servicestatus request
yep...
[Wed Apr 05 18:33:52.314577 2017] [:error] [pid 82718] [client xxx.xx.xx.xxx:57309] PHP Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 2 bytes) in /opt/app/nagiosxi/html/backend/includes/xml2json.php on line 243
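For anyone scanning a large error_log for the same failure, a minimal Python sketch along these lines can pull out the configured limit and the failing script. The regular expression is modelled only on the single log line quoted above, so other PHP or Apache versions may need adjustments, and the function name is made up for illustration.

```python
import re

# Pattern modelled on the one log line quoted in this thread (an assumption:
# other PHP/Apache versions may word or prefix the message differently).
MEMORY_ERROR = re.compile(
    r"Allowed memory size of (?P<limit>\d+) bytes exhausted "
    r"\(tried to allocate (?P<wanted>\d+) bytes\) "
    r"in (?P<script>\S+) on line (?P<line>\d+)"
)


def find_memory_errors(lines):
    """Return one dict per log line that reports PHP memory exhaustion."""
    hits = []
    for text in lines:
        match = MEMORY_ERROR.search(text)
        if match:
            hits.append({
                # Convert the byte limit to whole MiB for easier reading.
                "limit_mib": int(match["limit"]) // (1024 * 1024),
                "script": match["script"],
                "line": int(match["line"]),
            })
    return hits
```

Passing an open file handle for /var/log/httpd/error_log as `lines` should work, since the function only iterates over lines. For the entry above it reports a 128 MiB limit (134217728 bytes) reached in xml2json.php on line 243.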
Re: API failing with http code 500 on Servicestatus request
Can you try increasing your memory_limit in /etc/php.ini to roughly double what it is now? Be sure to do a service httpd restart afterwards to apply the changes.
Former Nagios employee
https://www.mcapra.com/
Re: API failing with http code 500 on Servicestatus request
Tried double (256M). No good.
Tried 512M. Still failed.
Tried 1024M. That was enough memory, but then PHP timed out.
Bumped the timeout from 30 secs to 60.
That worked, but it took nearly the entire minute. Already a challenge for a user interface.
We are planning on tripling the number of servers/service checks on that XI server.
That will undoubtedly mean bumping up these values to unreasonable numbers.
We are really counting on the API.
Any thoughts?
Re: API failing with http code 500 on Servicestatus request
If you have a bunch of services, an un-filtered servicestatus request is going to be very big. One way around this would be to limit the results returned by a single request by using a limited query. See "Building Limited Queries" from the API help section for more info.
For example, if I wanted to get all the service statuses I could do this:
Code: Select all
curl -XGET "http://192.168.67.1/nagiosxi/api/v1/objects/servicestatus?apikey=KR2LLsBuhmmFnS4dbmeURW0culVlv39vbbBVW8pet69bXdH8CUiK8DcFX7gMpohD&pretty=1"
But that's really big and takes a long time for PHP to build. A better approach might be to get the records in chunks by using the records variable in my GET request like so:
Code: Select all
curl -XGET "http://192.168.67.1/nagiosxi/api/v1/objects/servicestatus?apikey=KR2LLsBuhmmFnS4dbmeURW0culVlv39vbbBVW8pet69bXdH8CUiK8DcFX7gMpohD&pretty=1&records=1:10"
Which will only return the first 10 records found. A simple iteration with increments of 10/20/100/etc in your script until no results are found might be a gentler way to get all that information via the API.
Former Nagios employee
https://www.mcapra.com/
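The "iterate in chunks until no results are found" idea could be sketched in Python roughly as follows. This is an illustration under stated assumptions, not the XI-recommended client: the function names are hypothetical, the reply shape (servicestatuslist containing servicestatus) is taken from the Perl snippet later in this thread, and the exact text expected by the records parameter is deliberately left to a caller-supplied function so it can be checked against the "Building Limited Queries" help page first.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def extract_services(payload):
    """Return the service records from a decoded servicestatus reply as a list."""
    outer = payload.get("servicestatuslist")
    if not isinstance(outer, dict):
        return []
    records = outer.get("servicestatus")
    # Assumption: a lone record may arrive as one object instead of a list.
    if isinstance(records, dict):
        return [records]
    return list(records) if isinstance(records, list) else []


def iter_services(fetch_page, chunk_size=100):
    """Yield records chunk by chunk; stop on an empty or short chunk."""
    offset = 0
    while True:
        chunk = extract_services(fetch_page(offset, chunk_size))
        yield from chunk
        if len(chunk) < chunk_size:
            return
        offset += chunk_size


def http_fetcher(base_url, api_key, records_value):
    """Build a fetch_page(offset, size) callable that queries a real server.

    records_value(offset, size) must return the text for the `records`
    parameter; its exact format is left to the caller on purpose.
    """
    def fetch_page(offset, size):
        query = urlencode(
            {"apikey": api_key, "records": records_value(offset, size)}, safe=":"
        )
        with urlopen(f"{base_url}/objects/servicestatus?{query}", timeout=60) as resp:
            return json.load(resp)
    return fetch_page
```

Keeping each request small is the point of the design: PHP only has to build one chunk at a time, so neither the memory limit nor the execution timeout has to grow with the number of services. `iter_services` takes any callable, so the loop can be tried against canned data before pointing `http_fetcher` at a production server.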
Re: API failing with http code 500 on Servicestatus request
I like the limited query concept.
So, what I really want is to get all the servicestatus data for a single host.
This totally did the trick and is quite fast:
Code: Select all
&host_name=in:zzzzzzz.att.com&pretty=1
Thanks very much!
Rick
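A small Python sketch of building that per-host request, with the parameters URL-encoded so unusual host names do not break the query string. The function name, base URL, and host below are placeholders; only the host_name=in:<host> and pretty=1 parameters come from the post above.

```python
from urllib.parse import urlencode


def host_services_url(base_url, api_key, host_name):
    """Build a servicestatus URL restricted to one host via host_name=in:<host>."""
    query = urlencode(
        {"apikey": api_key, "host_name": f"in:{host_name}", "pretty": 1},
        safe=":",  # leave the colon of "in:" unescaped; it is legal in a query
    )
    return f"{base_url}/objects/servicestatus?{query}"


# Hypothetical values for illustration only.
print(host_services_url("http://xi.example.com/nagiosxi/api/v1", "KEY", "web01.example.com"))
```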
Re: API failing with http code 500 on Servicestatus request
Glad that we could help. Did you have any more related questions or is it okay to lock the thread?
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
Re: API failing with http code 500 on Servicestatus request
Hi.
The API seems to have a number of inconsistencies.
I'm trying to delete a host.
First, I get a list of all the services and load them into an array.
Here are the code snippets (from two subroutines):
Code: Select all
# Fetch the servicestatus records for one host.
$browser = LWP::UserAgent->new;
$url = "http://$xiServer/nagiosxi/api/v1/objects/servicestatus?apikey=$apiKey&host_name=in:$hostName&pretty=1";
$req = HTTP::Request->new(GET => $url);
$response = $browser->request($req);
if (! $response->is_success) {
    my $errorMsg = $response->status_line;
    print "$errorMsg\n";
    exit 1;
}
$responseContent = $response->content;
Code: Select all
# Decode the JSON and collect each service's display_name.
my $json = JSON->new;
my $perlData = $json->decode($jsonData);
my $allServiceDataHashRef = $perlData->{servicestatuslist};
my $serviceArrayRef = ${$allServiceDataHashRef}{'servicestatus'};
for my $serviceHashRef (@{$serviceArrayRef}) {
    my $serviceName = ${$serviceHashRef}{'display_name'};
    push @serviceList, $serviceName;
}
Now I run through the array of service names and try to delete them:
Code: Select all
# Issue one DELETE per service name.
$browser = LWP::UserAgent->new;
$url = "http://$xiServer/nagiosxi/api/v1/config/service?apikey=$apiKey&host_name=$hostName&service_description=$service";
$req = HTTP::Request->new(DELETE => $url);
$response = $browser->request($req);
print $response->content;
As expected, what I get back are a bunch of messages like these:
{"success":"Removed xxxxxx.xxxx.att.com :: Proc_STAR-crond from the system. Config imported but not yet applied."}
{"success":"Removed xxxxxx.xxxx.att.com :: Proc_STAR-vxconfigd from the system. Config imported but not yet applied."}
{"success":"Removed xxxxxx.xxxx.att.com :: Proc_STAR-BESClient from the system. Config imported but not yet applied."}
Code: Select all
# Ask XI to apply the pending configuration.
$browser = LWP::UserAgent->new;
$url = "http://$xiServer/nagiosxi/api/v1/system/applyconfig?apikey=$apiKey";
$req = HTTP::Request->new(POST => $url);
$response = $browser->request($req);
When I look at the Nagios console, all the services are still there.
I try an Apply Configuration through CCM and *sometimes* the services disappear in XI but then slowly start trickling back in.
They are definitely not coming from Unconfigured Objects so I can only assume there is some mismatch between Core and XI.
I run all the above code again and I get a smattering of different responses:
{"error":"Could not find a unique id for this service."}
{"error":"Could not find a unique id for this service."}
{"error":"Could not find a unique id for this service."}
{"error":"Could not find a unique id for this service."}
{"success":"Removed xxxxxx.xxxx.att.com :: Filespace_STAR-var from the system. Config imported but not yet applied."}
{"success":"Removed xxxxxx.xxxx.att.com :: Filespace_STAR-var-adm from the system. Config imported but not yet applied."}
{"error":"Could not find a unique id for this service."}
{"success":"Removed xxxxxx.xxxx :: Filespace_STAR-opt-openv from the system. Config imported but not yet applied."}
{"success":"Removed xxxxxx.xxxx :: Filespace_STAR-var-adm-crash from the system. Config imported but not yet applied."}
Our enterprise is so huge that we are absolutely counting on the API to function properly.
Should I open a ticket for this?
Thanks,
Rick
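While the cause is being investigated, the delete-then-apply sequence described in this post could be made easier to audit with something like the Python sketch below. It is not a fix for the behaviour reported here. It only URL-encodes the parameters, records every reply that lacks a "success" key, and skips the apply step when any delete failed. The endpoints and the success/error reply keys are the ones shown in the post; the function names and the skip-on-failure policy are assumptions made for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def api_url(base_url, path, **params):
    """Join base URL, path, and URL-encoded query parameters."""
    return f"{base_url}/{path}?{urlencode(params)}"


def send(method, url):
    """Issue one request and decode the JSON reply (performs network I/O)."""
    with urlopen(Request(url, method=method), timeout=60) as resp:
        return json.load(resp)


def delete_services(base_url, api_key, host_name, services, send=send):
    """Delete each service, then apply config only if every delete succeeded.

    Returns a dict mapping each failed service name to its error text.
    """
    failed = {}
    for service in services:
        reply = send("DELETE", api_url(
            base_url, "config/service", apikey=api_key,
            host_name=host_name, service_description=service))
        if "success" not in reply:
            failed[service] = reply.get("error", "unexpected reply")
    if not failed:
        # Assumed policy: apply once, and only after a fully clean run.
        send("POST", api_url(base_url, "system/applyconfig", apikey=api_key))
    return failed
```

One detail may be worth checking while debugging: the listing code above collects display_name, while the delete call passes that value as service_description. Whether those two fields always match on these servers is not established in this thread.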
Re: API failing with http code 500 on Servicestatus request
Please run these commands and post the full output:
Code: Select all
ipcs -q
ps aux
chage -l nagios
chage -l apache
sestatus
When you receive the success messages for deletion, if you log in to the CCM, does it still show the host/services or are they gone?
Also, please PM one of us a copy of your profile. You can download it by going to Admin > System Config > System Profile and clicking the Download Profile button in the top right corner.
Thank you