example host delete api call claims to work, appears not to

lukesullivan
Posts: 34
Joined: Tue Jan 24, 2017 11:12 am

Post by lukesullivan »

I'm running Nagios XI 5.3.2 on RHEL 6.8 (64-bit), installed manually.

I've written some tests for creating hosts / services via the rest api, and that seems to be working fine.

When I try to delete services or hosts, the API claims that the hosts/services are removed, but they continue to appear in the Nagios XI UI. I tried following along with the example code in the API docs provided in the product.

[lukas@nagiosxi-60ox ~]$ curl -k -XPOST "https://nagiosxi-60ox.noc.harvard.edu/n ... d&pretty=1" -d "host_name=testapihostapply&address=127.0.0.1&check_command=check_ping\!3000,80%\!5000,100%&max_check_attempts=2&check_period=24x7&contacts=nagiosadmin&notification_interval=5&notification_period=24x7&applyconfig=1"
{
"success": "Successfully added testapihostapply to the system. Config applied, Nagios Core was restarted."
}
[lukas@nagiosxi-60ox ~]$ curl -k -XDELETE "https://nagiosxi-60ox.noc.harvard.edu/n ... lyconfig=1"
{
"success": "Removed testapihostapply from the system. Config applied, Nagios Core was restarted."
}

the host is not actually removed from nagiosxi.

Is there something that I'm missing? The user whose api key I'm using above has admin access.

The host has no services associated, it is not a member of any hostgroup. It's about as simple a case as you can get.

thanks,

-Luke
rkennedy
Posts: 6579
Joined: Mon Oct 05, 2015 11:45 am

Re: example host delete api call claims to work, appears not

Post by rkennedy »

What you've done looks correct. However, two bugs have been fixed since your version that could be related to this not working properly.

Can you upgrade to the latest version, and let us know if the issue persists?
Former Nagios Employee
lukesullivan
Posts: 34
Joined: Tue Jan 24, 2017 11:12 am

Re: example host delete api call claims to work, appears not

Post by lukesullivan »

partial success.

I've updated Nagios XI and am now on 5.4.2.

when I now call the delete endpoint, it appears to somewhat work.

the basic curl examples from the doc appear to work. However, I'm not going to just use curl on the command line... so I have a small python script that does the actual job I'm looking for.

for a host with 20+ services, I just want to delete all services. I use the get method to obtain all services for a given host, then iterate through calling the delete method for each service. The http responses appear to look ok (below). I then call the applyconfig method to finalize all of the service removals. That applyconfig appears to succeed. However, when I go into the web ui, there is still a need for me to hit the "apply config" button. Sometimes hitting that apply config button just hangs (lots of progress dots, no actual progress). When the apply config does complete, I go look at services, and the services are still there. A sample of the debug output from my python code is below.

DEBUG:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): nagiosxi-60ox.noc.harvard.edu
/opt/nagios-python-venv/lib/python3.6/site-packages/requests/packages/urllib3/connectionpool.py:852: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/lates ... l-warnings
InsecureRequestWarning)
DEBUG:requests.packages.urllib3.connectionpool:https://nagiosxi-60ox.noc.harvard.edu:443 "DELETE /nagiosxi/api/v1/config/service?apikey=v2b38L8pHjI3S03NE5MS6vZOaImkrFpZWTi5FptCLrUTseG6iJeHetrHb8hFMJUJ&pretty=1&host_name=akron-mdf-sw1.net.harvard.edu&service_description=---%20Nu0 HTTP/1.1" 200 124
DEBUG:root:OK
DEBUG:root:200
DEBUG:root:{
"success": "Removed akron-mdf-sw1.net.harvard.edu :: --- Nu0 from the system. Config imported but not yet applied."
}

DEBUG:root:b'{\n "success": "Removed akron-mdf-sw1.net.harvard.edu :: --- Nu0 from the system. Config imported but not yet applied."\n}\n'
DEBUG:root:calling nagios applyconfig method
DEBUG:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): nagiosxi-60ox.noc.harvard.edu
/opt/nagios-python-venv/lib/python3.6/site-packages/requests/packages/urllib3/connectionpool.py:852: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/lates ... l-warnings
InsecureRequestWarning)
DEBUG:requests.packages.urllib3.connectionpool:https://nagiosxi-60ox.noc.harvard.edu:443 "POST /nagiosxi/api/v1/system/applyconfig?apikey=v2b38L8pHjI3S03NE5MS6vZOaImkrFpZWTi5FptCLrUTseG6iJeHetrHb8hFMJUJ&pretty=1 HTTP/1.1" 200 72
DEBUG:root:OK
DEBUG:root:200
DEBUG:root:{
"success": "Apply config command has been sent to the backend."
}

DEBUG:root:b'{\n "success": "Apply config command has been sent to the backend."\n}\n'
rkennedy
Posts: 6579
Joined: Mon Oct 05, 2015 11:45 am

Re: example host delete api call claims to work, appears not

Post by rkennedy »

There may be something hanging with your apply configuration at this point, which in turn would affect the API's ability to apply.

Can you please run tail -f /usr/local/nagiosxi/var/cmdsubsys.log and then attempt to apply? Please post the output, as it will show in more detail what's going on under the hood.
Former Nagios Employee
lukesullivan
Posts: 34
Joined: Tue Jan 24, 2017 11:12 am

Re: example host delete api call claims to work, appears not

Post by lukesullivan »

from a healthy nagiosxi state, I tailed the cmdsubsys.log file, and then ran the script I have.

the script iterates over each service on a host, calling the http delete method, with applyconfig=1.

the script sleeps for 5 seconds between delete calls.

when the script is running, the nagiosxi web ui begins to show incomplete results (doesn't have a list of all hosts / services / hostgroups).

I have attached the tail of the cmdsubsys.log, as captured from an ssh session.

Please let me know if there is anything interesting in that file, or what additional diagnostics are needed.

This is a high priority issue for me, preventing a go live of a production monitoring system.

thanks,

-Luke
rkennedy
Posts: 6579
Joined: Mon Oct 05, 2015 11:45 am

Re: example host delete api call claims to work, appears not

Post by rkennedy »

The cmdsubsys log looks fine. Every time it applied, it appears to have done so successfully.
lukesullivan wrote: "when the script is running, the nagiosxi web ui begins to show incomplete results (doesn't have a list of all hosts / services / hostgroups)."
This is expected. An apply configuration rewrites your SQL database out to flat files, and this takes time to process.
lukesullivan wrote: "the script sleeps for 5 seconds between delete calls."
Are you applying configuration after EVERY delete call, or only at the end of your array? I've seen apply configuration take up to a couple of minutes to run through the full process, so a 5 second wait may not be enough. If you're applying after every call, I would modify your script to apply only once, at the end of your array.
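
As a rough illustration of "delete everything first, apply once at the end", here is a minimal Python sketch. The endpoint paths and query parameters are the ones visible in the debug output earlier in this thread. The function names, the example host name, and the placeholder API key are hypothetical, and the sketch only builds the list of requests rather than sending them.

```python
from urllib.parse import quote, urlencode


def build_delete_url(api_base, api_key, host_name, service_description):
    """Return the DELETE URL for one service, deliberately without applyconfig=1."""
    query = urlencode(
        {
            "apikey": api_key,
            "pretty": 1,
            "host_name": host_name,
            "service_description": service_description,
        },
        quote_via=quote,  # encode spaces as %20, matching the debug log in this thread
    )
    return f"{api_base}/config/service?{query}"


def build_apply_url(api_base, api_key):
    """Return the POST URL for the system/applyconfig endpoint."""
    query = urlencode({"apikey": api_key, "pretty": 1})
    return f"{api_base}/system/applyconfig?{query}"


def plan_bulk_delete(api_base, api_key, host_name, service_descriptions):
    """Return (method, url) pairs: every delete first, then exactly one applyconfig."""
    plan = [
        ("DELETE", build_delete_url(api_base, api_key, host_name, description))
        for description in service_descriptions
    ]
    plan.append(("POST", build_apply_url(api_base, api_key)))
    return plan


if __name__ == "__main__":
    # Hypothetical host and key, for illustration only.
    base = "https://nagiosxi.example.com/nagiosxi/api/v1"
    for method, url in plan_bulk_delete(base, "YOUR_API_KEY", "switch1", ["--- Nu0", "Ping"]):
        print(method, url)
```

Each pair could then be sent in order with an HTTP client such as requests, checking every response before moving on. Because the apply call only reports that the command was sent to the backend, the script still cannot assume the apply has finished when that call returns.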
Former Nagios Employee
lukesullivan
Posts: 34
Joined: Tue Jan 24, 2017 11:12 am

Re: example host delete api call claims to work, appears not

Post by lukesullivan »

I can modify the script to do that.

I would be interested to know if there is a way to tell if the applyconfig request is still running.

What happens if I post another applyconfig before the previous one finishes? Does that cause nagios to wind up in an unpredictable state?

What is the maximum frequency at which I should run applyconfig?

When the applyconfig is running, is the core nagios engine still working, can I expect health checks to still be run?

thanks,

-Luke
rkennedy
Posts: 6579
Joined: Mon Oct 05, 2015 11:45 am

Re: example host delete api call claims to work, appears not

Post by rkennedy »

lukesullivan wrote: "What happens if I post another applyconfig before the previous one finishes? Does that cause nagios to wind up in an unpredictable state?"
Currently, it does not queue. Depending on where the running apply is at, the file will be locked, and I believe it will attempt to apply again, pending timeouts. A while back I filed a feature request to add 'apply queuing': FR #10434.
lukesullivan wrote: "What is the maximum frequency at which I should run applyconfig?"
This really depends on your environment. Personally, I would run it once every few hours. An apply may take longer in larger environments, which is why you may want to hold off on doing it so often.
lukesullivan wrote: "When the applyconfig is running, is the core nagios engine still working, can I expect health checks to still be run?"
It first writes all of your configuration files out from the database, which takes time, and then verifies them. If that succeeds, it runs a restart. The restart shouldn't take very long at all, so I wouldn't expect you to miss any checks, but this could become a problem on an overworked XI machine.
Former Nagios Employee
lukesullivan
Posts: 34
Joined: Tue Jan 24, 2017 11:12 am

Re: example host delete api call claims to work, appears not

Post by lukesullivan »

thanks. The reason I'm doing this is I want field techs to be able to reconfigure monitoring when making changes to network devices (add monitoring when they bring up a new interface, remove monitoring when they're decomm'ing a device). I need some options to do it in bulk. I expect it won't be done very often in regular practice, but I do want them to be able to do it on demand.

the tool I'm writing will run on the nagios box itself, so if it was possible to have the script make a determination if there is a current running apply, I could catch that, and abort / warn the user to wait.

if you invoke the applyconfig via the api, and the applyconfig runs long, would it make sense to adjust the php timers, as per the article below?

https://support.nagios.com/kb/article.php?id=34

thanks,

-Luke
rkennedy
Posts: 6579
Joined: Mon Oct 05, 2015 11:45 am

Re: example host delete api call claims to work, appears not

Post by rkennedy »

lukesullivan wrote: "if you invoke the applyconfig via the api, and the applyconfig runs long, would it make sense to adjust the php timers, as per https://support.nagios.com/kb/article.php?id=34"
Yes and no. This is generally only needed for configs where you may be importing hundreds of hosts / services at once. If you're at that kind of scale, then yes, it may make sense.

I wrote an article that helps explain the API / templates / hostgroups, which you might find handy. It doesn't directly relate to the work you're doing, but there may be bits you find useful.

Another note: since you're planning to run the tool on the local Nagios machine, I believe a lock file is created during an apply configuration. You could check for its existence, then use a wait loop to go back and check it again. It's located in the /usr/local/nagiosxi/scripts/ directory - LOCKFILE="$BASEDIR/reconfigure_nagios.lock"
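
A minimal Python sketch of that check-and-wait loop might look like the following. It assumes $BASEDIR resolves to the scripts directory mentioned above, so the full path is an assumption, and the function names, timeout, and poll interval are hypothetical values chosen for illustration.

```python
import os
import time

# Assumed full path: the scripts directory plus the lock file name from the post above.
LOCKFILE = "/usr/local/nagiosxi/scripts/reconfigure_nagios.lock"


def apply_in_progress(lockfile=LOCKFILE):
    """Return True if the lock file exists, i.e. an apply appears to be running."""
    return os.path.exists(lockfile)


def wait_for_apply(lockfile=LOCKFILE, timeout=300.0, poll_interval=5.0):
    """Poll until the lock file disappears.

    Returns True once the lock file is gone, or False if it is still
    present after roughly `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while os.path.exists(lockfile):
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
    return True


if __name__ == "__main__":
    if apply_in_progress():
        print("An apply configuration appears to be running, waiting...")
    if wait_for_apply():
        print("No apply in progress, safe to continue.")
    else:
        print("Apply still running after the timeout, please try again later.")
```

This is only an existence check, so it is not atomic: another apply could start between the check and your own request, and a stale lock file left behind by a failed apply would make the tool wait until the timeout. Treat the result as a hint for warning the user rather than as a guarantee.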
Former Nagios Employee