API failing with http code 500 on Servicestatus request

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
grenley
Posts: 96
Joined: Tue May 13, 2014 6:06 pm

Post by grenley »

Hi.

All of a sudden, on two of our production XI servers, the objects/servicestatus API is failing.
When called through the XI (Help) GUI, nothing is returned.
When I call it through a script, I'm getting "500 Internal Server Error".

objects/hoststatus is working fine, so it's not an API key issue or anything like that.
Table corruption, perhaps?

What kind of info can I give you to help troubleshoot?

Thanks,
Rick
avandemore
Posts: 1597
Joined: Tue Sep 27, 2016 4:57 pm

Re: API failing with http code 500 on Servicestatus request

Post by avandemore »

The Apache error log would be the first place to start. It is usually located at /var/log/httpd/error_log.
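To narrow that log down to PHP failures, something like the following can help. It is only a sketch: `recent_php_errors` is a made-up helper name, and the default path is the one mentioned above, which may differ on your system.

```shell
# Sketch: show the most recent PHP errors from the Apache error log.
# recent_php_errors is a hypothetical helper name; the default path is the
# one mentioned above and may be different on your distribution.
recent_php_errors() {
    local log="${1:-/var/log/httpd/error_log}"
    local count="${2:-20}"
    # Match PHP fatal errors and warnings, then keep only the last $count hits.
    grep -E 'PHP (Fatal error|Warning)' "$log" | tail -n "$count"
}
```

Calling `recent_php_errors` with no arguments reads the default log; a different path and line count can be passed as the first and second arguments.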
Previous Nagios employee
grenley
Posts: 96
Joined: Tue May 13, 2014 6:06 pm

Re: API failing with http code 500 on Servicestatus request

Post by grenley »

yep...
[Wed Apr 05 18:33:52.314577 2017] [:error] [pid 82718] [client xxx.xx.xx.xxx:57309] PHP Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 2 bytes) in /opt/app/nagiosxi/html/backend/includes/xml2json.php on line 243
mcapra
Posts: 3739
Joined: Thu May 05, 2016 3:54 pm

Re: API failing with http code 500 on Servicestatus request

Post by mcapra »

Can you try increasing your memory_limit in /etc/php.ini to roughly double what it is now? Be sure to do a service httpd restart afterwards to apply the changes.
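Before editing, it can help to see what the file currently sets. The helper below is a sketch under assumptions: `show_php_limits` is a hypothetical name, the path is the one given above, and `max_execution_time` is included only because it is PHP's script time limit and tends to matter once the memory limit is raised.

```shell
# Sketch: print the active memory_limit and max_execution_time lines.
# show_php_limits is a hypothetical helper; /etc/php.ini is the path named above.
show_php_limits() {
    local ini="${1:-/etc/php.ini}"
    # Only uncommented directive lines match; php.ini comments start with ';'.
    grep -E '^[[:space:]]*(memory_limit|max_execution_time)[[:space:]]*=' "$ini"
}

# Typical use (as root on the XI server):
#   show_php_limits            # note the current values
#   vi /etc/php.ini            # e.g. change memory_limit = 128M to 256M
#   service httpd restart      # apply the change, as described above
```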
Former Nagios employee
https://www.mcapra.com/
grenley
Posts: 96
Joined: Tue May 13, 2014 6:06 pm

Re: API failing with http code 500 on Servicestatus request

Post by grenley »

Tried double (256M). No good.
Tried 512M. Still failed.
Tried 1024M. That was enough memory, but then the PHP script timed out.
Bumped the timeout from 30 seconds to 60.
That worked, but the request took nearly the entire minute, which is already a challenge for a user interface.
We are planning on tripling the number of servers/service checks on that XI server.
That will undoubtedly mean bumping up these values to unreasonable numbers.
We are really counting on the API.
Any thoughts?
mcapra
Posts: 3739
Joined: Thu May 05, 2016 3:54 pm

Re: API failing with http code 500 on Servicestatus request

Post by mcapra »

If you have a lot of services, an unfiltered servicestatus request is going to be very big. One way around this is to limit the results returned by a single request by using a limited query. See "Building Limited Queries" in the API help section for more info.

For example, if I wanted to get all the service statuses I could do this:

Code:

curl -XGET "http://192.168.67.1/nagiosxi/api/v1/objects/servicestatus?apikey=KR2LLsBuhmmFnS4dbmeURW0culVlv39vbbBVW8pet69bXdH8CUiK8DcFX7gMpohD&pretty=1"
But that's really big and takes a long time for PHP to build. A better approach might be to get the records in chunks by using the records variable in my GET request like so:

Code:

curl -XGET "http://192.168.67.1/nagiosxi/api/v1/objects/servicestatus?apikey=KR2LLsBuhmmFnS4dbmeURW0culVlv39vbbBVW8pet69bXdH8CUiK8DcFX7gMpohD&pretty=1&records=1:10"
That will only return the first 10 records found. A simple loop in your script that steps through the records in increments of 10/20/100/etc. until no results are returned might be a gentler way to get all that information via the API.
Former Nagios employee
https://www.mcapra.com/
grenley
Posts: 96
Joined: Tue May 13, 2014 6:06 pm

Re: API failing with http code 500 on Servicestatus request

Post by grenley »

I like the limited query concept.
So, what I really want is to get all the servicestatus data for a single host.
This totally did the trick and is quite fast:

Code:

&host_name=in:zzzzzzz.att.com&pretty=1
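For scripted use, the same host-filtered request could be sketched with curl roughly as follows. The helper name `host_servicestatus` and the `XI_SERVER` / `XI_APIKEY` variables are placeholders for illustration, not anything defined by XI; `-G` with `--data-urlencode` simply lets curl build and URL-encode the query string.

```shell
# Sketch: fetch servicestatus for a single host, URL-encoding each parameter.
# host_servicestatus is a hypothetical helper; XI_SERVER and XI_APIKEY are
# placeholder variables, not values from this thread.
host_servicestatus() {
    local host="$1"
    # CURL can be overridden to inspect the arguments without sending a request.
    ${CURL:-curl} -sS -G "http://${XI_SERVER}/nagiosxi/api/v1/objects/servicestatus" \
        --data-urlencode "apikey=${XI_APIKEY}" \
        --data-urlencode "host_name=in:${host}" \
        --data-urlencode "pretty=1"
}
```

With `XI_SERVER` and `XI_APIKEY` set, `host_servicestatus zzzzzzz.att.com` would issue the request for that one host.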
Thanks very much!
Rick
cdienger
Support Tech
Posts: 5045
Joined: Tue Feb 07, 2017 11:26 am

Re: API failing with http code 500 on Servicestatus request

Post by cdienger »

Glad that we could help. Did you have any more related questions or is it okay to lock the thread?
grenley
Posts: 96
Joined: Tue May 13, 2014 6:06 pm

Re: API failing with http code 500 on Servicestatus request

Post by grenley »

Hi.
The API seems to have a number of inconsistencies.
I'm trying to delete a host.
First, I get a list of all the services and load them into an array.
Here are the code snippets (from two subroutines):

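Code:

    use LWP::UserAgent;
    use HTTP::Request;
    use URI::Escape qw(uri_escape);

    $browser = LWP::UserAgent->new;

    # URL-encode the host name so it cannot break the query string
    $url = "http://$xiServer/nagiosxi/api/v1/objects/servicestatus?apikey=$apiKey&host_name=in:" . uri_escape($hostName) . "&pretty=1";

    $req = HTTP::Request->new(GET => $url);

    $response = $browser->request($req);

    # Stop on any unsuccessful response and report the HTTP status line
    if (! $response->is_success) {
        my $errorMsg = $response->status_line;
        print "$errorMsg\n";
        exit 1;
    }

    $responseContent = $response->content;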

Code:

    use JSON;

    my $json = JSON->new;
    my $perlData = $json->decode($jsonData);

    my $allServiceDataHashRef = $perlData->{servicestatuslist};
    my $serviceArrayRef = $allServiceDataHashRef->{servicestatus};

    # Collect each service's display_name (note: in Nagios, display_name can be
    # configured to differ from the service description)
    for my $serviceHashRef (@{$serviceArrayRef}) {
        my $serviceName = $serviceHashRef->{display_name};
        push @serviceList, $serviceName;
    }
Now I run through the array of service names and try to delete them:

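Code:

        use URI::Escape qw(uri_escape);

        $browser = LWP::UserAgent->new;
        # URL-encode both names; service descriptions can contain spaces or other special characters
        $url = "http://$xiServer/nagiosxi/api/v1/config/service?apikey=$apiKey&host_name=" . uri_escape($hostName) . "&service_description=" . uri_escape($service);

        $req = HTTP::Request->new(DELETE => $url);

        $response = $browser->request($req);

        print $response->content;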
As expected, what I get back are a bunch of messages like these:
{"success":"Removed xxxxxx.xxxx.att.com :: Proc_STAR-crond from the system. Config imported but not yet applied."}
{"success":"Removed xxxxxx.xxxx.att.com :: Proc_STAR-vxconfigd from the system. Config imported but not yet applied."}
{"success":"Removed xxxxxx.xxxx.att.com :: Proc_STAR-BESClient from the system. Config imported but not yet applied."}
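Finally, I try to do an applyconfig:

Code:

    $browser = LWP::UserAgent->new;

    $url = "http://$xiServer/nagiosxi/api/v1/system/applyconfig?apikey=$apiKey";

    $req = HTTP::Request->new(POST => $url);

    $response = $browser->request($req);

    # Print the reply (or the HTTP status on failure) so the outcome of the apply is visible
    print $response->is_success ? $response->content : $response->status_line, "\n";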
When I look at the Nagios console, all the services are still there.
I try an Apply Configuration through CCM and *sometimes* the services disappear in XI but then slowly start trickling back in.
They are definitely not coming from Unconfigured Objects so I can only assume there is some mismatch between Core and XI.

I run all the above code again and I get a smattering of different responses:
{"error":"Could not find a unique id for this service."}
{"error":"Could not find a unique id for this service."}
{"error":"Could not find a unique id for this service."}
{"error":"Could not find a unique id for this service."}
{"success":"Removed xxxxxx.xxxx.att.com :: Filespace_STAR-var from the system. Config imported but not yet applied."}
{"success":"Removed xxxxxx.xxxx.att.com :: Filespace_STAR-var-adm from the system. Config imported but not yet applied."}
{"error":"Could not find a unique id for this service."}
{"success":"Removed xxxxxx.xxxx :: Filespace_STAR-opt-openv from the system. Config imported but not yet applied."}
{"success":"Removed xxxxxx.xxxx :: Filespace_STAR-var-adm-crash from the system. Config imported but not yet applied."}
Obviously, I can't delete the host itself until all the services are gone.
Our enterprise is so huge that we are absolutely counting on the API to function properly.
Should I open a ticket for this?
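For comparison, the delete-then-apply sequence described above could be sketched in curl as follows, with every name URL-encoded and the apply step skipped if any delete reports an error. This is not a confirmed fix for the behavior in this post; `delete_host_services`, `XI_SERVER` and `XI_APIKEY` are hypothetical names used only for illustration.

```shell
# Sketch: delete each service for a host, then apply the configuration only
# if no delete reported an error. Not a confirmed fix for the issue above.
# delete_host_services is a hypothetical helper; XI_SERVER and XI_APIKEY are
# placeholder variables. Service names are read one per line from stdin.
delete_host_services() {
    local host="$1" base="http://${XI_SERVER}/nagiosxi/api/v1" failed=0 svc reply
    while IFS= read -r svc; do
        reply=$(${CURL:-curl} -sS -G -X DELETE "${base}/config/service" \
            --data-urlencode "apikey=${XI_APIKEY}" \
            --data-urlencode "host_name=${host}" \
            --data-urlencode "service_description=${svc}" </dev/null)
        printf '%s\n' "$reply"
        # Replies shown in this thread carry either a "success" or an "error" key.
        case $reply in *'"error"'*) failed=1 ;; esac
    done
    if [ "$failed" -ne 0 ]; then
        echo "at least one delete failed; skipping applyconfig" >&2
        return 1
    fi
    ${CURL:-curl} -sS -X POST "${base}/system/applyconfig?apikey=${XI_APIKEY}"
}
```

A call would look like `printf '%s\n' "Proc_STAR-crond" "Proc_STAR-vxconfigd" | delete_host_services xxxxxx.xxxx.att.com`, with the service list coming from whatever the earlier servicestatus query returned.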

Thanks,
Rick
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Re: API failing with http code 500 on Servicestatus request

Post by ssax »

Please run these commands and post the full output:

Code:

ipcs -q
ps aux
chage -l nagios
chage -l apache
sestatus
When you receive the success messages for deletion, if you log in to the CCM, does it still show the host/services, or are they gone?

Also, please PM one of us a copy of your system profile. You can download it by going to Admin > System Config > System Profile and clicking the Download Profile button in the top right corner.


Thank you
Locked