Duration

orani
Posts: 169
Joined: Wed May 06, 2015 3:33 pm

Post by orani »

How can I preserve the state duration after a reboot of the Nagios server?

Example: I have a host that has been down for 3d4h32m. If I restart the Nagios server, the host is still down after the reboot, but the duration shows 0d0h1m or similar.
mcapra
Posts: 3739
Joined: Thu May 05, 2016 3:54 pm

Re: Duration

Post by mcapra »

Check the following configuration directives in your main Nagios Core configuration file to make sure they reflect your desired behavior:
https://assets.nagios.com/downloads/nag ... nformation

Code:

retain_state_information
state_retention_file
retention_update_interval

If your state_retention_file is held on a ramdisk, or in some other ephemeral storage (like /tmp on some Linux flavors), it is likely being removed on system restarts regardless of anything Nagios Core attempts.
Former Nagios employee
https://www.mcapra.com/
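
The check suggested above can be sketched in shell. This is a minimal sketch, not an official Nagios tool: the function names `nagios_directive` and `fs_type_of` are made up for illustration, and `fs_type_of` assumes GNU coreutils `stat` on Linux. The first function reads one directive from the main configuration file while skipping commented-out lines; the second reports the filesystem type of the directory holding a file, so a value such as tmpfs would point at ephemeral storage.

```shell
# Print the value of one directive from a Nagios main config file.
# Comment lines are skipped; whitespace around "=" is tolerated.
# (Hypothetical helper name, not part of Nagios.)
nagios_directive() {
    local cfg="$1" name="$2"
    awk -F= -v key="$name" '
        /^[ \t]*#/ { next }              # ignore commented-out lines
        index($0, "=") > 0 {
            k = $1
            gsub(/^[ \t]+|[ \t]+$/, "", k)
            if (k == key) {
                v = substr($0, index($0, "=") + 1)
                gsub(/^[ \t]+|[ \t]+$/, "", v)
                print v
            }
        }
    ' "$cfg"
}

# Print the filesystem type of the directory that holds a file.
# Assumes GNU stat; "tmpfs" or "ramfs" would mean the file does not
# survive a reboot.
fs_type_of() {
    stat -f -c %T "$(dirname "$1")"
}
```

With the paths used in this thread, `fs_type_of "$(nagios_directive /usr/local/nagios/etc/nagios.cfg state_retention_file)"` would show what kind of storage the retention file sits on.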

Re: Duration

Post by orani »

Code:

retain_state_information =1
state_retention_file=/usr/local/nagios/var/retention.dat
retention_update_interval=60

Those are my settings. I think that with those settings Nagios should keep the duration after a restart, but it does not.
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises
Contact:

Re: Duration

Post by scottwilkerson »

orani wrote: How can I preserve the state duration after a reboot of the Nagios server?

Example: I have a host that has been down for 3d4h32m. If I restart the Nagios server, the host is still down after the reboot, but the duration shows 0d0h1m or similar.

Can you show the check command that is producing this for this host?
Former Nagios employee
Creator:
ahumandesign.com
enneagrams.com

Re: Duration

Post by orani »

This is not limited to a single check; it happens with all checks.

These are some of my definitions:

Code:

define host{
        host_name               Infra1
        address                 10.0.6.11
        use                     host-pnp
        contacts                nagiosadmin,systemadmin
        check_command           check-host-alive
        max_check_attempts      5
        check_interval          1
        retry_interval          1
        parents                 SMCSW
        check_period            24x7
        hostgroups              KDS PHYSICAL SERVERS
        statusmap_image         rack-server.gd2
        notification_interval   0
        notification_options    d,u,r,s
        notification_period     24x7
}

define service{
        host_name               Infra1
        use                     service-pnp
        check_command           check_nt!CPULOAD! -l 5,80,90 -s nagios
        contacts                nagiosadmin,systemadmin
        max_check_attempts      5
        service_description     CPU Load
        check_interval          1
        retry_interval          1
        check_period            24x7
        servicegroups           CPU Load
        notification_interval   0
        notification_options    w,u,c,r,s
}

Re: Duration

Post by scottwilkerson »

Can you show the output of the following?

Code:

grep reta /usr/local/nagios/etc/nagios.cfg

Re: Duration

Post by orani »

[root@nagios ~]# grep reta /usr/local/nagios/etc/nagios.cfg
retain_state_information=1
# This file is used only if the retain_state_information
# retention file. If you want to use retained program status
use_retained_program_state=1
# This setting determines whether or not Nagios will retain
# If you want to use retained scheduling info, set this
use_retained_scheduling_info=1
# service attributes that should *not* be retained by Nagios during
# of flap detection and event handlers for hosts to be retained, you
# This mask determines what host attributes are not retained
retained_host_attribute_mask=0
# This mask determines what service attributes are not retained
retained_service_attribute_mask=0
# These two masks determine what process attributes are not retained.
retained_process_host_attribute_mask=0
retained_process_service_attribute_mask=0
# These two masks determine what contact attributes are not retained.
retained_contact_host_attribute_mask=0
retained_contact_service_attribute_mask=0

Re: Duration

Post by scottwilkerson »

That all looks correct. Let's look at the permissions on retention.dat:

Code:

ls -al /usr/local/nagios/var/retention.dat

Re: Duration

Post by orani »

Code:

[root@nagios ~]# ls -al /usr/local/nagios/var/retention.dat
-rw------- 1 nagios nagios 2857104 Sep  9 19:06 /usr/local/nagios/var/retention.dat
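
Beyond ownership and mode, the modification time in this listing is worth comparing against retention_update_interval, which is expressed in minutes (60 in the settings posted earlier). The following is a rough sketch under stated assumptions: `retention_file_is_fresh` is a hypothetical helper name, and it relies on GNU `stat -c %Y` to read the modification time as epoch seconds.

```shell
# Succeed (exit 0) if FILE was modified within the last INTERVAL_MIN
# minutes, fail (exit 1) if it is older, and exit 2 if it cannot be read.
# (Hypothetical helper; assumes GNU stat.)
retention_file_is_fresh() {
    local file="$1" interval_min="$2" now mtime
    now=$(date +%s)
    mtime=$(stat -c %Y "$file" 2>/dev/null) || return 2
    [ $(( now - mtime )) -le $(( interval_min * 60 )) ]
}
```

For example, `retention_file_is_fresh /usr/local/nagios/var/retention.dat 60 && echo fresh` run while Nagios is up gives a quick hint about whether the file is being rewritten on schedule. A stale file would suggest Nagios cannot write it, though that is only one possible explanation.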

Re: Duration

Post by scottwilkerson »

Does it reset the duration if you just restart the nagios service?

Code:

service nagios restart

Everything looks correct so far. The only thing I can't tell is whether, for some reason, the /usr/local/nagios/var/retention.dat file is being lost when you reboot the Nagios server.
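
One way to tell the two cases apart is to look inside retention.dat before and after the restart or reboot. The sketch below assumes the usual retention file layout, where each host is stored in a `host { ... }` block of `key=value` lines that includes `host_name` and `last_state_change` (an epoch timestamp), and that the displayed duration is derived from the time since `last_state_change`. Both function names are hypothetical helpers, not Nagios commands.

```shell
# Print the retained last_state_change timestamp for one host.
# Assumes "host {" blocks containing key=value lines.
host_last_state_change() {
    local file="$1" host="$2"
    awk -v host="$host" '
        /^host[ \t]*[{]/ { in_host = 1; name = ""; ts = ""; next }
        in_host && /^[ \t]*host_name=/         { name = $0; sub(/^[^=]*=/, "", name) }
        in_host && /^[ \t]*last_state_change=/ { ts = $0;   sub(/^[^=]*=/, "", ts) }
        in_host && /^[ \t]*[}]/ {
            if (name == host) print ts
            in_host = 0
        }
    ' "$file"
}

# Format a number of seconds in the 0d0h0m style used in this thread.
format_duration() {
    local secs="$1"
    printf '%dd%dh%dm\n' $(( secs / 86400 )) $(( secs % 86400 / 3600 )) $(( secs % 3600 / 60 ))
}
```

Running `format_duration $(( $(date +%s) - $(host_last_state_change /usr/local/nagios/var/retention.dat Infra1) ))` before and after a reboot should show whether the old timestamp survived. If the file is missing or the timestamp is recent after the reboot, the retained state was lost or overwritten.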