
How To Clear Solaris Service Maintenance Status

Problem Description

This KB article explains how to clear the Solaris maintenance state on a service, specifically the Nagios Core or NRPE service. It focuses on Nagios Core, but the steps apply to NRPE as well.

When the Nagios Core service starts, it verifies the configuration files. If the configuration is invalid, the Nagios Core service will NOT start, and you must fix the problem Nagios Core is complaining about. This is normal Nagios Core behaviour and is not specific to Solaris.

On Solaris, however, after a service has failed to start several times, Solaris puts the service into what is called a maintenance state. This state prevents a small problem from becoming a bigger problem. Even after fixing the problem Nagios Core is complaining about, you must also clear the maintenance state before Solaris allows the service to be started again.

This KB article will show you how to determine what the problem is and how to resolve the issue.

 

Diagnose The Problem

A common scenario is when you have rebooted your Solaris server and the Nagios Core service fails to start.

The first step is to determine why the service did not start.

Execute the following command to see detailed status information:

svcs -xv nagios

 

The output resembles something like this:

svc:/application/nagios:default (?)
 State: maintenance since March  6, 2017 04:57:38 PM EST
Reason: Start method failed repeatedly, last exited with status 8.
   See: http://support.oracle.com/msg/SMF-8000-KS
   See: /var/svc/log/application-nagios:default.log
Impact: This service is not running.

 

It's clear that the service is in a maintenance state, but the only detail about the cause is that the start method failed repeatedly. The output does, however, give the name of a log file: /var/svc/log/application-nagios:default.log.
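If you need to pick the log file name out of the status output in a script, the following is a minimal sketch. The function name smf_log_paths is hypothetical, and it assumes a POSIX-style awk and that the output has the same "See:" layout as the example above.

```shell
# Hypothetical helper: read `svcs -xv` output on stdin and print any
# "See:" target that is an absolute path (this skips the support URL).
smf_log_paths() {
    awk '$1 == "See:" && $2 ~ /^\// { print $2 }'
}

# Intended use on a Solaris host (left commented out here):
#   svcs -xv nagios | smf_log_paths
```

The result can then be passed to tail in place of the path typed by hand.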

Execute the following command to perform further troubleshooting:

tail -20 /var/svc/log/application-nagios:default.log

 

The output might resemble something like this:

License: GPL

Website: https://www.nagios.org
Reading configuration data...
   Read main config file okay...
Error: Invalid max_check_attempts value for host 'test'
Error: Could not register host (config file '/usr/local/nagios/etc/objects/localhost.cfg', starting on line 33)
   Error processing object config files!


***> One or more problems was encountered while processing the config files...

     Check your configuration file(s) to ensure that they contain valid
     directives and data defintions.  If you are upgrading from a previous
     version of Nagios, you should be aware that some variables/definitions
     may have been removed or modified in this version.  Make sure to read
     the HTML documentation regarding the config files, as well as the
     'Whats New' section to find out what has changed.

[ Mar  6 16:57:38 Method "start" exited with status 8. ]

 

This is a common Nagios Core problem: the object definition in the configuration file is missing required directives. In this example the template was left out of the host definition, so all the common options it would have supplied were missing.
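On a long log it can help to reduce the "Could not register" lines to a file and line number you can open directly. The sketch below assumes the message wording matches the example log above; the function name nagios_error_locations is hypothetical.

```shell
# Hypothetical helper: read Nagios log text on stdin and print
# "config-file:line" for each error that names a config file.
nagios_error_locations() {
    sed -n "s/.*(config file '\([^']*\)', starting on line \([0-9]*\)).*/\1:\2/p"
}

# Intended use (left commented out here):
#   tail -20 /var/svc/log/application-nagios:default.log | nagios_error_locations
```

For the log shown above this points at line 33 of localhost.cfg, which is where the broken host definition starts.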

Before proceeding you need to fix the error that is being reported. Once you think you've resolved the issue, you can run the verify command to check:

/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

 

The output of a successful verify should end like this:

Total Errors:   0

Things look okay - No serious problems were detected during the pre-flight check
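If you want to check the verify result from a script instead of reading it by eye, one approach is to look for the "Total Errors:" line. This is only a sketch: the function name nagios_verify_ok is hypothetical, and it assumes a POSIX-style awk and the output format shown above.

```shell
# Hypothetical helper: read `nagios -v` output on stdin and succeed
# only if it contains a "Total Errors:" line whose value is 0.
nagios_verify_ok() {
    awk '$1 == "Total" && $2 == "Errors:" { found = 1; ok = ($3 == 0) }
         END { exit !(found && ok) }'
}

# Intended use (left commented out here):
#   /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg | nagios_verify_ok \
#       && echo "configuration verified"
```

Output with no "Total Errors:" line at all is treated as a failure, so a verify run that aborts early is not mistaken for a clean one.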

 

At this point, however, if you try to start the service you'll find it still won't start:

svcadm enable nagios

svcs -xv nagios

 

The output will be like this:

svc:/application/nagios:default (?)
 State: maintenance since March  6, 2017 04:57:38 PM EST
Reason: Start method failed repeatedly, last exited with status 8.
   See: http://support.oracle.com/msg/SMF-8000-KS
   See: /var/svc/log/application-nagios:default.log
Impact: This service is not running.

 

Don't be fooled into thinking there is still a Nagios configuration issue. Pay careful attention to the date and time on the State line: it hasn't changed, which means Solaris has not attempted another start. Solaris will refuse to start the service until the maintenance state is cleared.
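To tell the two situations apart in a script, you can read the state word from the State line. This is a minimal sketch with a hypothetical function name, smf_state, assuming the output layout shown above.

```shell
# Hypothetical helper: read `svcs -xv` output on stdin and print the
# word after "State:", for example "maintenance" or "online".
smf_state() {
    awk '$1 == "State:" { print $2 }'
}

# Intended use (left commented out here):
#   if [ "$(svcs -xv nagios | smf_state)" = "maintenance" ]; then
#       echo "maintenance state still needs to be cleared"
#   fi
```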

 

 

Clear Maintenance State

Execute the following command to clear the maintenance state:

svcadm clear nagios

 

Execute the following command to start Nagios:

svcadm enable nagios

 

Now check the state of the service:

svcs -xv nagios

 

The output resembles something like this:

svc:/application/nagios:default (?)
 State: online since March  6, 2017 05:17:21 PM EST
   See: /var/svc/log/application-nagios:default.log
Impact: None.

 

Problem solved: Nagios Core is now running again.
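The verify, clear and enable steps can be strung together as a small shell function. Treat this as a sketch and not a supported script: the function name recover_nagios and the three override variables are hypothetical, and it assumes that the verify command exits with a non-zero status when the configuration is invalid.

```shell
# Assumed defaults taken from the commands in this article; override
# the variables (for example SVCADM=echo) to dry-run the sequence.
SVCADM=${SVCADM:-svcadm}
NAGIOS_BIN=${NAGIOS_BIN:-/usr/local/nagios/bin/nagios}
NAGIOS_CFG=${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}

# Hypothetical helper: verify the configuration, then clear the
# maintenance state and enable the service.
recover_nagios() {
    # Stop here if verification fails (assumes a non-zero exit status on errors).
    "$NAGIOS_BIN" -v "$NAGIOS_CFG" >/dev/null || return 1
    "$SVCADM" clear nagios || return 1
    "$SVCADM" enable nagios
}

# Intended use on the Solaris host (left commented out here):
#   recover_nagios && svcs -xv nagios
```

Running the verify step first avoids clearing the maintenance state while the configuration is still broken, which would only send the service back into maintenance.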

 

 

Final Thoughts

For any support related questions please visit the Nagios Support Forums at:

http://support.nagios.com/forum/
