config state & Nagios service behavior

Posted: Thu May 30, 2019 5:17 pm
by Maxwellb99
Hello,

I have two closely related concepts: config state & Nagios service behavior.

Questions of interest:
What are some events that would cause the Nagios service to stop? Specifically as it relates to the API.
Can I check the config state? I'd really be interested in getting the error messages. [see screen cap]
bad_config.PNG

Nagios service behavior
I thought that creating services with the force flag and repeatedly applying the config (in a bad state) would kill Nagios, but I can't recreate that behavior. Recently I had a host that was created with its hostgroup set to NULL. Nagios showed that the config was created, but there was no entry for that host in the Nagios instance. When I tried to create the host manually, Nagios told me that there was a duplicate entry. The host then showed up and I was able to remove the NULL hostgroup. Upon removing the host I was able to apply the config & restart the Nagios service.

config state
I have a series of daily scripts, and I'm looking for a way to check the result & possibly remediate the host(s)/service(s) from a bad state programmatically. Is there a way to access the error message? There are times when I create a service & the host doesn't exist. Nagios is kind enough to create a host for me, but it doesn't have any parameters and it's set as inactive. Thus, apply_config fails.

The closest behavior I found from the api is: querying the command_id for the apply_config command. Are these status & event codes with respect to the configs?

{'success': 'Apply config command has been sent to the backend.', 'command_id': 61}
[{'command': '17',
'command_id': '61',
'event_time': '2019-05-30 17:48:17',
'processing_time': None,
'result': None,
'result_code': '0',
'status_code': '0',
'submission_time': '2019-05-30 17:48:17',
'submitter_id': '1'}]

Thanks,
Maxwell Ramirez

Re: config state & Nagios service behavior

Posted: Fri May 31, 2019 11:34 am
by cdienger
Was the hostgroup literally set to "NULL" like in:

Code:

curl -XPOST "https://XI_IP/nagiosxi/api/v1/config/host?apikey=API_KEY&pretty=1" -d "host_name=testapihost&address=127.0.0.1&check_command=check_ping\!3000,80%\!5000,100%&max_check_attempts=2&check_period=24x7&contacts=nagiosadmin&notification_interval=5&notification_period=24x7&hostgroups=NULL&applyconfig=1"
?

I'm testing on a 5.6.1 machine and it does create a host entry in the Nagios instance despite the hostgroup not being valid.

The error message isn't accessible through the API but you could use the objects/host and/or the objects/service endpoint to get record counts to verify the config has been added.
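
The record-count check suggested above could be scripted roughly as follows. This is a minimal sketch rather than a confirmed recipe: the `host_name` query filter and the `recordcount` field are assumptions about the objects/host response and may differ between XI versions, and the function names are hypothetical.

```python
def extract_record_count(payload):
    """Return the record count from a parsed objects/* response, or None.

    Assumption: the count lives in a 'recordcount' field, either at the
    top level or one level down inside a wrapper object.
    """
    if not isinstance(payload, dict):
        return None
    if "recordcount" in payload:
        return int(payload["recordcount"])
    for value in payload.values():
        if isinstance(value, dict) and "recordcount" in value:
            return int(value["recordcount"])
    return None


def host_exists(xi_host, api_key, host_name):
    """Ask the objects/host endpoint whether host_name is known to Nagios."""
    # Imported here so the parsing helper above works without requests installed.
    import requests

    url = "https://{}/nagiosxi/api/v1/objects/host".format(xi_host)
    # Filtering by host_name is an assumption; check it against your API docs.
    params = {"apikey": api_key, "host_name": host_name}
    response = requests.get(url, params=params, verify=False)
    response.raise_for_status()
    count = extract_record_count(response.json())
    return count is not None and count > 0
```

A daily script could call `host_exists(...)` for each host it just created and flag the ones that come back `False` for manual review.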

Re: config state & Nagios service behavior

Posted: Sun Jun 02, 2019 4:08 pm
by Maxwellb99
Hello,

I didn't test it from the API. A search in the GUI initially showed no *host*. I added it, it showed that it was a duplicate, and the HG was set to "blank". Upon setting it to a legitimate hostgroup I was able to apply the config.

But I don't particularly care about that. The Core questions are given in red.

Counting is problematic... Here's what I'm trying to accomplish. I wrote scripts to run a daily true-up of our environment. I don't want to have to go in every morning & manually apply the config on each of our Nagios servers to make sure everything is OK. Even if I were to keep track of every addition and removal I'd need to pass that across multiple scripts and I'd need to parse the returned result as some say 'unable to ... use ccm'.

Please respond to the questions in red along with the definition of the 'status_code' and 'event_code' from the apply_config command.

If this is the best we can do, I'm asking for a feature request to return the status of the configs and the error message.

Thanks,
Maxwell Ramirez

Re: config state & Nagios service behavior

Posted: Mon Jun 03, 2019 12:49 pm
by cdienger
"What are some events that would cause the Nagios service to stop? Specifically as it relates to the API."

Applying the config will stop the service and then restart it, but I'm assuming you're looking for more of a "what will crash the Nagios service using the API?", in which case I don't have any examples.

"Can I check the config state? I'd really be interested in getting the error messages."

The message comes from verification of the configuration, which the API doesn't do. I will file a feature request for the ability to get verification information through the API.

The "apply_config" command - are you referring to the system/applyconfig endpoint? Can you provide the full command and output so we can look into this?

Re: config state & Nagios service behavior

Posted: Tue Jun 04, 2019 4:12 pm
by Maxwellb99
I was asking about the 'system/command/command_id' endpoint, where 'command_id' is the command_id of the apply_config command previously executed via the 'system/applyconfig' endpoint. Specifically, what do the 'result_code' and the 'status_code' refer to?

Code:

    
import pprint
import requests

# Method of a client class; self.curr_nagios and api_keys are defined elsewhere.
def get_config_status(self, id_in, print_result=0):
    # Fetch the record for a previously submitted command by its command_id.
    url = "https://{}/nagiosxi/api/v1/system/command/{}?apikey={}".format(
        self.curr_nagios, id_in, api_keys[self.curr_nagios])
    r = requests.get(url, verify=False).json()

    if print_result:
        print(self.curr_nagios)
        pprint.pprint(r)

    return r

Can you describe a way to check the configs without going through the API? I'm pretty plugged in via Python, Ansible, & shell scripts. Once I have the result I can munge it & return it.

Thanks,
Maxwell Ramirez

Re: config state & Nagios service behavior

Posted: Tue Jun 04, 2019 8:57 pm
by Maxwellb99
Never mind. I found the reconfigure_nagios.sh script and fixed it.
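
For readers looking for a way to check the config outside the API, one option is to run Nagios Core's own pre-flight check (`nagios -v`) and collect the lines it reports. The sketch below assumes the default binary and config paths of a source install, which may differ on your servers, and the function names are hypothetical. It is not necessarily what reconfigure_nagios.sh does.

```python
import subprocess

# Assumed default paths; adjust for your installation.
NAGIOS_BIN = "/usr/local/nagios/bin/nagios"
NAGIOS_CFG = "/usr/local/nagios/etc/nagios.cfg"


def parse_verify_output(text):
    """Collect the 'Error:' and 'Warning:' lines from `nagios -v` output."""
    errors, warnings = [], []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Error:"):
            errors.append(stripped)
        elif stripped.startswith("Warning:"):
            warnings.append(stripped)
    return errors, warnings


def verify_config(binary=NAGIOS_BIN, cfg=NAGIOS_CFG):
    """Run the pre-flight check and return (ok, errors, warnings)."""
    proc = subprocess.run([binary, "-v", cfg], capture_output=True, text=True)
    errors, warnings = parse_verify_output(proc.stdout + proc.stderr)
    return proc.returncode == 0, errors, warnings
```

Run on the Nagios server itself (for example from Ansible), `verify_config()` gives a script the error text that the API does not return.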

Re: config state & Nagios service behavior

Posted: Wed Jun 05, 2019 1:41 pm
by lmiltchev
@Maxwellb99, let us know if it is safe to lock this thread. Thanks!

Re: config state & Nagios service behavior

Posted: Wed Jun 05, 2019 1:59 pm
by cdienger
Status codes:

Code:

define("COMMAND_STATUS_QUEUED", 0);
define("COMMAND_STATUS_PROCESSING", 1);
define("COMMAND_STATUS_COMPLETED", 2);

Result codes:

Code:

define("COMMAND_RESULT_OK", 0);
define("COMMAND_RESULT_ERROR", 1);

If the configuration fails to apply, the result code will be 1.
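
Put together with the system/command lookup from earlier in the thread, these codes could be used to poll an apply-config command until it finishes. This is a minimal sketch under assumptions: `fetch` is a hypothetical callable returning the parsed JSON list for a command_id (such as the get_config_status method posted above), and the code only trusts result_code once status_code reports completed, since the queued sample earlier already shows a result_code of 0.

```python
import time

# Values copied from the defines quoted above.
STATUS_NAMES = {0: "queued", 1: "processing", 2: "completed"}
RESULT_NAMES = {0: "ok", 1: "error"}


def interpret_command(record):
    """Translate one system/command record into readable names."""
    status = int(record["status_code"])
    result = int(record["result_code"])
    return {
        "status": STATUS_NAMES.get(status, "unknown"),
        # Assumption: result_code is only meaningful once the command completed.
        "result": RESULT_NAMES.get(result, "unknown") if status == 2 else None,
    }


def wait_for_command(fetch, command_id, timeout=120, interval=5, sleep=time.sleep):
    """Poll until the command completes; return True if its result is OK."""
    waited = 0
    while waited <= timeout:
        records = fetch(command_id)
        if records:
            info = interpret_command(records[0])
            if info["status"] == "completed":
                return info["result"] == "ok"
        sleep(interval)
        waited += interval
    raise TimeoutError(
        "command {} did not complete in {}s".format(command_id, timeout))
```

A `False` return tells a daily script that the apply failed, but not why; the error text itself still has to come from the configuration verification on the server.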

Re: config state & Nagios service behavior

Posted: Thu Jun 06, 2019 10:56 am
by Maxwellb99
Cool. Thanks, I'm all set.

Re: config state & Nagios service behavior

Posted: Thu Jun 06, 2019 11:12 am
by cdienger
Sounds good. Locking.