check_nrpe+systemd wildcard issue
Re: check_nrpe+systemd wildcard issue
Moving SELinux over to "Permissive" caused everything to start working as expected. So I think my colleague was onto something. I'll be brushing up on that and will report back here if/when I find a solution.
Re: check_nrpe+systemd wildcard issue
Interesting! Thank you for letting us know. We'll keep an eye out for your results.
Re: check_nrpe+systemd wildcard issue
Okay, so, yes, SELinux was definitely causing the issue. The difference in behavior depending on how the service was launched came down to the SELinux context the process ended up running in. The solution is very specific to my setup, but I will go through the steps I used to diagnose and fix the issue:
Testing to see if SELinux is the culprit:
- Check to see if SELinux is enabled by running "getenforce". It will return either "Disabled", "Permissive", or "Enforcing". If it returns either of the first two, then SELinux is most likely not the problem.
- Assuming the previous step returned "Enforcing", temporarily tell SELinux to let things through by running the command "setenforce 0". This will move SELinux into "Permissive" mode. Run "getenforce" again to ensure that is now the case.
- Attempt the check_nrpe call from the Nagios server. If it now works, then SELinux is very likely the culprit. If it does not, then SELinux is probably not the culprit.
- Move SELinux back to "Enforcing" by running the command "setenforce 1". Run "getenforce" to verify that the change took place.
- Determine where your SELinux audit logs are. The standard location is /var/log/audit/audit.log.
- Run the check_nrpe command that is failing (from the Nagios server). Check to make sure it failed.
- Look for recent "denied" messages in the audit log - either by tail -f'ing the log or grepping for the word "denied" and matching the timestamp. When you find the message, you will most likely see the plugin name. NOTE: there may be more than one failure. I found tail -f'ing the file to be helpful here as I could see all audit messages getting logged in real time and could easily associate them to my check_nrpe calls.
- Copy those "denied" messages into a text file.
- Run "audit2why -i <text_file>" - this will tell you what needs to be done. In my case, I had two things to do: update a boolean config value using "setsebool", and create a new type enforcement rule for SELinux.
- NOTE: you should really understand SELinux before going to the following steps - or find someone who does.
- Creating the new type enforcement rule was pretty easy. I ran "audit2allow -i <text_file> -M <outputFileStem>" (where "text_file" is the copy of the "denied" messages and "outputFileStem" is the name you want to use for the output files, sans extension).
- The above step will create a .te file and a .pp file. The .te file is human-readable and lists the rules being created. The .pp file is the compiled policy package used to load the new rule.
- (Definitely do NOT do this step if you do not know what you are doing - find someone who does or start reading up on SELinux first!!) Load the new rule into SELinux by running "semodule -i outputFileStem.pp"
- Everything should work now.
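
For anyone who wants the steps above in one place, here is a rough shell sketch. It is only an outline under assumptions, not a drop-in fix: the function names, DENIALS_FILE, and MODULE_STEM are made up for illustration, and the audit log path is the standard one mentioned above, which may differ on your system. Nothing runs until you call a function, and everything needs root on the NRPE host.

```shell
#!/bin/sh
# Sketch of the diagnose-and-fix steps above. Run as root on the NRPE host.
# Function names, DENIALS_FILE, and MODULE_STEM are hypothetical.

AUDIT_LOG="${AUDIT_LOG:-/var/log/audit/audit.log}"   # standard location; yours may differ
DENIALS_FILE="${DENIALS_FILE:-nrpe_denials.txt}"     # hypothetical file name
MODULE_STEM="${MODULE_STEM:-nrpe_local}"             # hypothetical module name

# Copy every "denied" line from an audit log ($1) into a text file ($2)
# and print how many lines were saved. This grabs ALL denials in the log,
# so trim the file by hand to the entries whose timestamps match your
# failed check_nrpe call.
collect_denials() {
    grep 'denied' "$1" > "$2"
    wc -l < "$2" | tr -d ' '
}

# Explain what the saved denials ($1) mean and what would allow them.
explain_denials() {
    audit2why -i "$1"
}

# Generate <stem>.te (human-readable) and <stem>.pp (loadable)
# from the saved denials ($1) using the module name stem ($2).
build_module() {
    audit2allow -i "$1" -M "$2"
}

# Load the generated module ($1 is the stem). Read the .te file first and
# only do this if you understand what the rule allows.
load_module() {
    semodule -i "$1.pp"
}

# Typical sequence, run by hand one step at a time:
#   getenforce                  # expect "Enforcing"
#   setenforce 0; getenforce    # now "Permissive"; retry check_nrpe from the Nagios server
#   setenforce 1; getenforce    # back to "Enforcing"; retry check_nrpe so it fails again
#   collect_denials "$AUDIT_LOG" "$DENIALS_FILE"
#   explain_denials "$DENIALS_FILE"
#   build_module "$DENIALS_FILE" "$MODULE_STEM"    # then read "$MODULE_STEM.te"
#   load_module "$MODULE_STEM"
```

If audit2why points at a boolean instead of (or in addition to) a new rule, the "setsebool" step is separate and not covered here, since the boolean name depends entirely on what your own denials say.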