
Nagios critical problems with monitoring engine

Posted: Thu Dec 05, 2013 6:48 pm
by linuxnag
Hello,

I just tried to install Nagios XI on a new box (I have a license for another box),
and I see this:
[attachment: nagios_issue.png]
When I click the "Action->Start" button
(next to the red status icon) I get an error:

"An Error occurred processing your request"

How do I see this error? Where is the log file?

Thanks in advance

Re: Nagios critical problems with monitoring engine

Posted: Fri Dec 06, 2013 9:42 am
by tmcdonald
Can you post the last 200 or so lines of your nagios log in code wraps please?

Code: Select all

tail -200 /usr/local/nagios/var/nagios.log
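
Entries in nagios.log start with an epoch timestamp in square brackets, which is hard to read at a glance. A minimal sketch for converting them, assuming GNU date (for `-d @SECONDS`); the function name `nagios_log_readable` is made up for this example:

```shell
# Rewrite the leading [epoch] of each nagios.log line as a readable UTC time.
# Assumes GNU date; lines without a leading [digits] prefix pass through unchanged.
nagios_log_readable() {
  while IFS= read -r line; do
    case "$line" in
      \[[0-9]*\]*)
        ts=${line#\[}        # drop the opening bracket
        ts=${ts%%\]*}        # keep everything before the first closing bracket
        rest=${line#*\]}     # keep everything after the first closing bracket
        printf '[%s]%s\n' "$(date -u -d "@$ts" '+%Y-%m-%d %H:%M:%S')" "$rest"
        ;;
      *)
        printf '%s\n' "$line"
        ;;
    esac
  done
}
```

It could be used as `tail -200 /usr/local/nagios/var/nagios.log | nagios_log_readable`.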

Re: Nagios critical problems with monitoring engine

Posted: Fri Dec 06, 2013 10:13 am
by linuxnag
Question:
When I am connected (to the nagios server) through the web,
do I need to have access to any ports (other than 80) ?
My firewall would need to be adjusted if so.

Here is the output you requested:
This is the log from while I retried the restart action.
Note: Nothing popped up!
FYI: This is a brand new install.

Code: Select all

[root@nagios_new_prod:/var/log] # ls -ld /usr/local/nagios/var/nagios.log
-rw-r--r-- 1 nagios nagios 1678 Dec  6 09:30 /usr/local/nagios/var/nagios.log
[root@nagios_new_prod:/var/log] # tail -200 /usr/local/nagios/var/nagios.log
[1386306000] LOG ROTATION: DAILY
[1386306000] LOG VERSION: 2.0
[1386306000] CURRENT HOST STATE: localhost;UP;HARD;1;OK - 127.0.0.1: rta 0.008ms, lost 0%
[1386306000] CURRENT SERVICE STATE: localhost;Current Load;OK;HARD;1;OK - load average: 0.20, 0.26, 0.38
[1386306000] CURRENT SERVICE STATE: localhost;Current Users;OK;HARD;1;USERS OK - 2 users currently logged in
[1386306000] CURRENT SERVICE STATE: localhost;HTTP;OK;HARD;1;HTTP OK HTTP/1.1 200 OK - 2733 bytes in 0.001 seconds
[1386306000] CURRENT SERVICE STATE: localhost;PING;OK;HARD;1;PING OK - Packet loss = 0%, RTA = 0.04 ms
[1386306000] CURRENT SERVICE STATE: localhost;Root Partition;OK;HARD;1;DISK OK - free space: / 5321 MB (55% inode=78%):
[1386306000] CURRENT SERVICE STATE: localhost;SSH;OK;HARD;1;SSH OK - OpenSSH_5.3 (protocol 2.0)
[1386306000] CURRENT SERVICE STATE: localhost;Swap Usage;OK;HARD;1;SWAP OK - 100% free (8191 MB out of 8191 MB)
[1386306000] CURRENT SERVICE STATE: localhost;Total Processes;OK;HARD;1;PROCS OK: 128 processes with STATE = RSZDT
[1386307856] Auto-save of retention data completed successfully.
[1386311456] Auto-save of retention data completed successfully.
[1386315056] Auto-save of retention data completed successfully.
[1386318656] Auto-save of retention data completed successfully.
[1386322256] Auto-save of retention data completed successfully.
[1386325856] Auto-save of retention data completed successfully.
[1386329456] Auto-save of retention data completed successfully.
[1386333056] Auto-save of retention data completed successfully.
[1386336656] Auto-save of retention data completed successfully.
[1386340256] Auto-save of retention data completed successfully.

Re: Nagios critical problems with monitoring engine

Posted: Fri Dec 06, 2013 10:27 am
by sreinhardt
It seems that the Nagios engine is running correctly. Let's make sure the lock file is in place and readable by apache/nagcmd. As for your port question: if you only want to use HTTP, port 80 is enough for the workstation administering the XI instance. Port 443 is an option but is not required. However, depending on the checks you are running and whether they pass through a firewall, you may need additional ports open for the monitored items.
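
One way to check the lock file could look like the sketch below. It assumes the lock file holds nothing but the PID of the running daemon and that you run it as root (as a regular user, `kill -0` also fails for processes you do not own). The function name `lock_pid_alive` is made up for this example:

```shell
# Report whether the PID stored in a lock file belongs to a running process.
# Returns 0 if running, 1 if not, 2 if the file is unreadable or malformed.
lock_pid_alive() {
  lock_file=$1
  if [ ! -r "$lock_file" ]; then
    echo "cannot read $lock_file"
    return 2
  fi
  pid=$(cat "$lock_file")
  case "$pid" in
    ''|*[!0-9]*)
      echo "$lock_file does not contain a plain PID"
      return 2
      ;;
  esac
  # kill -0 sends no signal; it only tests whether the process can be signalled
  if kill -0 "$pid" 2>/dev/null; then
    echo "process $pid is running"
  else
    echo "no running process for PID $pid"
    return 1
  fi
}
```

For example: `lock_pid_alive /usr/local/nagios/var/nagios.lock`. To see whether the web server user can read the file, something like `sudo -u apache test -r /usr/local/nagios/var/nagios.lock && echo readable` should work, assuming the web server runs as `apache` on your system.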

Re: Nagios critical problems with monitoring engine

Posted: Fri Dec 06, 2013 10:37 am
by linuxnag
What I was worried about was that I needed an extra open port
in order to perform the restart action, using the "Action"
button on the web interface.

Ok, so here is the output you requested:

Code: Select all

[root@nagios_new_prod:/usr/local/nagios/var] # pwd
/usr/local/nagios/var
[root@nagios_new_prod:/usr/local/nagios/var] # ls -ld /usr/local/nagios
drwxr-xr-x 8 root root 4096 Dec  5 18:05 /usr/local/nagios
[root@nagios_new_prod:/usr/local/nagios/var] # ls -ld /usr/local/nagios/var
drwxrwxr-x 6 nagios nagios 4096 Dec  6 10:34 /usr/local/nagios/var
[root@nagios_new_prod:/usr/local/nagios/var] # pwd
/usr/local/nagios/var
[root@nagios_new_prod:/usr/local/nagios/var] # find . -ls | grep -i lock
287367    4 -rw-r--r--   1 nagios   nagios          5 Dec  5 18:09 ./nagios.lock
287391    4 -rw-r--r--   1 nagios   nagios          5 Dec  5 18:09 ./ndo2db.lock
EDIT: Please wrap only your logs in code tags, not your whole post - tmcdonald
EDIT: k, cool. Any news on this?

Re: Nagios critical problems with monitoring engine

Posted: Fri Dec 06, 2013 1:05 pm
by slansing
If you stop and start nagios from the command line:

Code: Select all

service nagios stop

killall nagios

service nagios start
Are you then able to restart the process from your web interface? You should not need any additional ports open to pass commands through apache to the nagios server.

Re: Nagios critical problems with monitoring engine

Posted: Fri Dec 06, 2013 1:59 pm
by linuxnag
I see the problem:
So far, I had only checked the logs in /usr/local/nagios/var
Old habits...
The relevant log turned out to be in:
/usr/local/nagiosxi/var

This is what it showed as I tried restarting the service:

Code: Select all

==> cmdsubsys.log <==
S COMMAND: CMD=11, DATA=a:1:{i:0;s:0:"";}
CMDLINE=/etc/init.d/nagios start
/etc/init.d/nagios: line 87: /etc/rc.d/init.d/functions: Permission denied
OUTPUT=
RETURNCODE=1

[root@nagios_new_prod:/usr/local/nagiosxi/var] # ls -ld /etc/rc.d/init.d/functions
-rw-------. 1 root root 18172 Jan  6  2013 /etc/rc.d/init.d/functions
[root@nagios_new_prod:/usr/local/nagiosxi/var] # chmod 755 /etc/rc.d/init.d/functions
[root@nagios_new_prod:/usr/local/nagiosxi/var] # ls -ld /etc/rc.d/init.d/functions
-rwxr-xr-x. 1 root root 18172 Jan  6  2013 /etc/rc.d/init.d/functions

Ok, restart now and I see:
==> cmdsubsys.log <==
COMMAND: CMD=13, DATA=a:1:{i:0;s:0:"";}
CMDLINE=/etc/init.d/nagios restart
Running configuration check...done.
Stopping nagios: done.
Starting nagios: done.
OUTPUT=Starting nagios: done.
RETURNCODE=0

Service is back up and all is green!
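
For anyone hitting something similar: a rough sketch for pulling failed commands out of cmdsubsys.log, assuming every entry has the `CMDLINE=` and `RETURNCODE=` lines shown above. The function name `failed_commands` is made up for this example:

```shell
# Print every command whose RETURNCODE is non-zero, together with its CMDLINE.
# Reads the named files, or standard input when no file is given.
failed_commands() {
  awk -F= '
    /^CMDLINE=/    { cmd = substr($0, 9) }                     # text after "CMDLINE="
    /^RETURNCODE=/ { if ($2 != 0) print "rc=" $2 ": " cmd }
  ' "$@"
}
```

Usage would be along the lines of `failed_commands /usr/local/nagiosxi/var/cmdsubsys.log`.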

Is there any issue with setting mode 755 on /etc/rc.d/init.d/functions
anyway?

Re: Nagios critical problems with monitoring engine

Posted: Fri Dec 06, 2013 2:48 pm
by slansing
It looks like it had something to do with the service call to the init script. This is resolved now, correct?

Re: Nagios critical problems with monitoring engine

Posted: Fri Dec 06, 2013 3:02 pm
by linuxnag
Yes, this is resolved but:

Is there any issue with setting mode 755 on /etc/rc.d/init.d/functions
anyway?

Can anyone think of a downside?

Re: Nagios critical problems with monitoring engine

Posted: Fri Dec 06, 2013 3:27 pm
by lmiltchev
linuxnag wrote: Can anyone think of a downside?
I don't see any. What is your concern?
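
For reference, init scripts pull in a functions file with `.` (source) rather than executing it, and sourcing only needs read permission. A small sketch that shows this with a throwaway file (the temp file and the `greet` function are made up for the demo):

```shell
# A sourced file only has to be readable; the execute bit plays no role.
demo=$(mktemp)                       # throwaway file standing in for a functions library
printf 'greet() { echo "hello from a sourced file"; }\n' > "$demo"
chmod 644 "$demo"                    # readable by everyone, executable by no one
. "$demo"                            # sourcing succeeds without the execute bit
greet
rm -f "$demo"
```

So 755 should be harmless, but 644 would presumably be enough here, since the original failure came from the file being readable by root only. If the file belongs to the initscripts package on your distribution, `rpm -V initscripts` should show whether its mode differs from what the package shipped.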