consistent start-stop script to nagiosxi after upgrade 5.5.1
Hello,
I'm seeing an interesting situation after a successful upgrade from 5.4.12 to 5.5.1 (actually 5.5.2 now).
On 5.4 I used the following sequence to stop the Nagios XI processes on a RHEL 7 platform (and the reverse order, from bottom to top, to start them):
service nagiosxi stop
sleep 3
service npcd stop
sleep 3
service ndo2db stop
sleep 3
service nagios stop
sleep 3
#service postgresql stop
service mariadb stop
sleep 3
service httpd stop
In 5.4 this worked well. A few days ago I restarted our Nagios XI instance with the same script and found that some of the processes are not running properly, which is irritating (picture attached).
I was able to start it from the GUI successfully, so at the moment I don't understand why the script fails.
Can you help me find what I did wrong, or where I can look for the failure?
Thank you, best regards,
Ferenc
Re: consistent start-stop script to nagiosxi after upgrade 5
Please attach the output of these commands:
Code:
ps aux | grep nagios.cfg
ipcs -q
I believe the init script (/etc/init.d/nagios) waits up to 90 seconds for the nagios process to stop, please see here, you can get duplicate processes if you don't wait until it's stopped:
Code:
# now we have to wait for nagios to exit and remove its
# own NagiosRunFile, otherwise a following "start" could
# happen, and then the exiting nagios will remove the
# new NagiosRunFile, allowing multiple nagios daemons
# to (sooner or later) run - John Sellens
#echo -n 'Waiting for nagios to exit .'
for i in {1..90}; do
    if status_nagios > /dev/null; then
        echo -n '.'
        sleep 1
    else
        break
    fi
done
if status_nagios > /dev/null; then
    echo ""
    echo "Warning - nagios did not exit in a timely manner - Killing it!"
    killproc_nagios KILL
else
    echo "done."
fi
Re: consistent start-stop script to nagiosxi after upgrade 5
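As a rough illustration of the point about waiting for nagios to exit, a stop script can poll each service's status instead of sleeping for a fixed 3 seconds. The sketch below is not taken from the Nagios XI init scripts: `wait_until_stopped` and `stop_all` are hypothetical names, and it assumes that each listed service's `status` action returns a non-zero exit code once the service is down, which should be verified on the target system.

```shell
#!/bin/bash
# Sketch only: stop services in order and poll until each one has really
# exited, rather than relying on a fixed "sleep 3" between steps.

# Hypothetical helper. Usage: wait_until_stopped TIMEOUT STATUS_COMMAND...
# Returns 0 as soon as STATUS_COMMAND fails (service reported as stopped),
# or 1 if it still succeeds after TIMEOUT one-second checks.
wait_until_stopped() {
    local timeout=$1
    shift
    local i=0
    while [ "$i" -lt "$timeout" ]; do
        # Assumption: a non-zero exit code from the status command means "stopped".
        if ! "$@" > /dev/null 2>&1; then
            return 0
        fi
        sleep 1
        i=$((i + 1))
    done
    return 1
}

# Hypothetical wrapper using the stop order from the original script.
stop_all() {
    local svc
    for svc in nagiosxi npcd ndo2db nagios mariadb httpd; do
        service "$svc" stop
        if ! wait_until_stopped 90 service "$svc" status; then
            echo "Warning: $svc still running after 90 seconds" >&2
        fi
    done
}
```

Calling `stop_all` runs the whole sequence. A matching start script would walk the same list in reverse order.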
In addition to my previous post, is NPCD running?
Code:
service npcd status
If it's not running:
Code:
service npcd start
Re: consistent start-stop script to nagiosxi after upgrade 5
Hello,
On a running instance it looks like this:
[root@naigos ~]# service npcd status
NPCD running (pid 8423).
[root@nagios ~]# ps aux | grep nagios.cfg
nagios 8372 0.0 0.0 54044 5056 ? Ss Aug17 0:48 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios 8380 0.0 0.0 53528 1376 ? S Aug17 0:10 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
root 25657 0.0 0.0 112708 996 pts/1 S+ 06:24 0:00 grep --color=auto nagios.cfg
[root@nagios ~]# ipcs -q
------ Message Queues --------
key msqid owner perms used-bytes messages
0xee000002 0 nagios 600 0 0
0x36000002 32769 nagios 600 0 0
After the stop:
[root@nagios ~]# service npcd status
NPCD not running.
[root@nagios ~]# ps aux | grep nagios.cfg
root 26253 0.0 0.0 112704 992 pts/1 S+ 06:26 0:00 grep --color=auto nagios.cfg
[root@nagios ~]# ipcs -q
------ Message Queues --------
key msqid owner perms used-bytes messages
0xee000002 0 nagios 600 0 0
0x36000002 32769 nagios 600 0 0
Then, starting the instance:
[root@nagios ~]# /root/nagiosxi_full_start.sh
Redirecting to /bin/systemctl start httpd.service
Redirecting to /bin/systemctl start mariadb.service
Starting nagios: done.
Starting ndo2db (via systemctl): [ OK ]
NPCD started.
[root@nagios ~]# service npcd status
NPCD running (pid 27054).
[root@nagios ~]# ps aux | grep nagios.cfg
nagios 26917 0.0 0.0 54044 2952 ? Ss 06:28 0:00 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios 26931 0.0 0.0 53528 1376 ? S 06:28 0:00 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
root 27197 0.0 0.0 112704 996 pts/1 S+ 06:28 0:00 grep --color=auto nagios.cfg
[root@nagios ~]# ipcs -q
------ Message Queues --------
key msqid owner perms used-bytes messages
0xee000002 0 nagios 600 0 0
0x36000002 32769 nagios 600 0 0
0xe3000002 65538 nagios 600 0 0
The GUI still shows that the monitoring engine is not started, and the same for the performance grapher.
I'm also seeing the problem shown in the attached picture.
Best regards,
Ferenc
Re: consistent start-stop script to nagiosxi after upgrade 5
Hello again,
That last error (check_icmp) is solved (ownership and the setuid bit were changed).
The current situation raises another question: where can the automatic updates of Nagios XI be disabled? (Not the update-available check!)
Best regards,
Ferenc
Re: consistent start-stop script to nagiosxi after upgrade 5
And hello again,
I also found the answer to the newest-version question!
Currently only my questions about the Performance Grapher and Monitoring Engine are unanswered.
Best regards,
Ferenc
Re: consistent start-stop script to nagiosxi after upgrade 5
Looks like you have too many kernel message queues (ipcs -q, you should only have one), please run these commands to fix:
Code:
service nagios stop
service ndo2db stop
for i in `ipcs -q | grep nagios |awk '{print $2}'`; do ipcrm -q $i; done
service ndo2db start
service nagios start
Please PM me a copy of your profile so that we can review your settings, you can download it from Admin > System Profile > Download Profile.
Re: consistent start-stop script to nagiosxi after upgrade 5
Hello,
I have sent you the system profiles; the issue occurs on both our UAT (5.5.1) and PROD (5.5.2) environments.
I tried stopping all the services and cleaning out the IPC message queues, but the result is the same: the Performance Grapher and Monitoring Engine still do not start correctly.
Thanks, best regards,
Ferenc
Re: consistent start-stop script to nagiosxi after upgrade 5
Received, looking at them now.
Re: consistent start-stop script to nagiosxi after upgrade 5
Please send a copy of your /etc/sudoers and the output of these commands:
Code:
grep nag /etc/group
chage -l nagios
grep "User\|Group" /etc/httpd/conf/httpd.conf
Additionally, are your servers AD/LDAP integrated?
And just to clarify, the only issue is that it's showing as red in the interface for the component status, correct?