Fake Alert

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
ericssonvietnam
Posts: 239
Joined: Mon Jun 27, 2016 11:05 pm

Post by ericssonvietnam »

Hi Team,

I have checked and found that there was no issue on the server, but I still got an out-of-bounds alert for the node below. Can you let me know what I can check on my end to resolve this?


I have also observed a similar issue with CPU usage.


Service: CPU Usage
Status: Critical
Duration: 3d 6h 4m 33s
Attempt: 5/5
Last Check: 2018-05-11 17:05:38
Status Information: CPU used 100.0% (>95) : CRITICAL
alert-1.PNG
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises
Contact:

Re: Fake Alert

Post by scottwilkerson »

To diagnose the issue, we need to know which plugin you are running and which arguments you are passing to it.

Can you attach the plugin along with the command that Nagios is running that returns these results?

If this is an NRPE call, we would need the plugin that is running on the remote system and the command.
Former Nagios employee
Creator:
ahumandesign.com
enneagrams.com

Re: Fake Alert

Post by ericssonvietnam »

Hi, I am using the two arguments below to generate the alert, and I run the command with the check_by_ssh plugin.

-C "/home/nagios/bin/check_logfiles -f /home/nagios/logfile_basedir/conf/PSP_linux.conf"

-t 60 -o StrictHostKeyChecking=no -l nagios -E

Below are the logs from /var/log/secure. Let me know if you see anything unusual.

May 15 03:01:26 dnrbta sshd[5996]: Connection from 10.10.164.52 port 40186
May 15 03:01:26 dnrbta sshd[5996]: Found matching RSA key: 43:cc:08:f2:87:4e:89:09:03:e5:1c:f6:c2:38:0a:98
May 15 03:01:26 dnrbta sshd[5997]: Postponed publickey for nagios from 10.10.164.52 port 40186 ssh2
May 15 03:01:26 dnrbta sshd[5996]: Found matching RSA key: 43:cc:08:f2:87:4e:89:09:03:e5:1c:f6:c2:38:0a:98
May 15 03:01:26 dnrbta sshd[5996]: Accepted publickey for nagios from 10.10.164.52 port 40186 ssh2
May 15 03:01:26 dnrbta sshd[5996]: pam_unix(sshd:session): session opened for user nagios by (uid=0)
May 15 03:01:27 dnrbta sshd[5998]: Received disconnect from 10.10.164.52: 11: disconnected by user
May 15 03:01:27 dnrbta sshd[5996]: pam_unix(sshd:session): session closed for user nagios

May 15 03:01:36 dnrbta sshd[6031]: Connection from 10.10.164.52 port 40222
May 15 03:01:36 dnrbta sshd[6031]: Found matching RSA key: 43:cc:08:f2:87:4e:89:09:03:e5:1c:f6:c2:38:0a:98
May 15 03:01:36 dnrbta sshd[6032]: Postponed publickey for nagios from 10.10.164.52 port 40222 ssh2
May 15 03:01:36 dnrbta sshd[6031]: Found matching RSA key: 43:cc:08:f2:87:4e:89:09:03:e5:1c:f6:c2:38:0a:98
May 15 03:01:36 dnrbta sshd[6031]: Accepted publickey for nagios from 10.10.164.52 port 40222 ssh2
May 15 03:01:36 dnrbta sshd[6031]: pam_unix(sshd:session): session opened for user nagios by (uid=0)
May 15 03:01:36 dnrbta sshd[6033]: Received disconnect from 10.10.164.52: 11: disconnected by user
May 15 03:01:36 dnrbta sshd[6031]: pam_unix(sshd:session): session closed for user nagios
May 15 03:01:39 dnrbta sshd[6058]: Connection from 10.10.164.52 port 40243
May 15 03:01:39 dnrbta sshd[6059]: Connection closed by 10.10.164.52
May 15 03:01:48 dnrbta sshd[6060]: Connection from 10.10.164.52 port 40284
May 15 03:01:48 dnrbta sshd[6061]: Connection closed by 10.10.164.52
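
The last four log lines above show connections from 10.10.164.52 that close without ever reaching "Accepted publickey". The following is a minimal sketch for pulling such attempts out of a longer log. The helper name `unauthenticated_connections` is hypothetical, and the sketch assumes the sshd log format shown in this post.

```shell
# Hypothetical helper: list "Connection from" entries in an sshd log that are
# never followed by "Accepted publickey" from the same sshd process.
# Reads the files given as arguments, or stdin when none are given.
# Assumes the syslog layout shown above: month day time host sshd[pid]: message
unauthenticated_connections() {
    awk '
        /sshd\[[0-9]+\]: Connection from / {
            # Remember timestamp plus source address and port for this attempt.
            line[++n] = $1 " " $2 " " $3 " " $8 ":" $10
            latest[$5] = n          # $5 is "sshd[pid]:", used as the process key
        }
        /sshd\[[0-9]+\]: Accepted publickey / {
            if ($5 in latest) authenticated[latest[$5]] = 1
        }
        END {
            for (i = 1; i <= n; i++)
                if (!(i in authenticated)) print line[i]
        }
    ' "$@"
}
```

It could be run as root on the monitored node with `unauthenticated_connections /var/log/secure`, and the timestamps it prints compared with the times of the alerts in Nagios XI.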

Re: Fake Alert

Post by scottwilkerson »

On the remote computer, if you su to nagios and run these commands, what do you get?

Code:

su nagios
/home/nagios/bin/check_logfiles -f /home/nagios/logfile_basedir/conf/PSP_linux.conf
echo $?
ls -al /home/nagios/logfile_basedir/conf/PSP_linux.conf

Re: Fake Alert

Post by ericssonvietnam »

Below is the output of the commands. As of now, I am facing this kind of issue with most of the log-based services on two of the nodes.

[nagios@dnrbta ~]$ /home/nagios/bin/check_logfiles -f /home/nagios/logfile_basedir/conf/PSP_linux.conf
OK - no errors or warnings|Power_Supply_lines=0 Power_Supply_warnings=0 Power_Supply_criticals=0 Power_Supply_unknowns=0
[nagios@dnrbta ~]$ echo $?
0
[nagios@dnrbta ~]$ ls -al /home/nagios/logfile_basedir/conf/PSP_linux.conf
-rw-r--r-- 1 nagios nagios 523 Jul 7 2016 /home/nagios/logfile_basedir/conf/PSP_linux.conf
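
A single successful run does not rule out a failure that only shows up now and then. The following is a minimal sketch for repeating a check and reporting only the runs that fail. `repeat_check` is a hypothetical helper, not part of Nagios, and it relies only on the standard plugin convention that exit code 0 means OK.

```shell
# Hypothetical helper: run a command N times and report every non-zero exit.
# Usage: repeat_check <runs> <command> [args...]
repeat_check() {
    runs=$1
    shift
    failures=0
    i=1
    while [ "$i" -le "$runs" ]; do
        start=$(date +%s)
        # Capture the plugin output and exit status without aborting on failure.
        if output=$("$@" 2>&1); then
            status=0
        else
            status=$?
        fi
        elapsed=$(( $(date +%s) - start ))
        if [ "$status" -ne 0 ]; then
            failures=$((failures + 1))
            echo "run $i: exit $status after ${elapsed}s: $output"
        fi
        i=$((i + 1))
    done
    echo "failures=$failures/$runs"
}
```

For example, `repeat_check 20 /home/nagios/bin/check_logfiles -f /home/nagios/logfile_basedir/conf/PSP_linux.conf` would run the same command as above twenty times as the nagios user.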

Re: Fake Alert

Post by scottwilkerson »

This is weird because it seems to be working intermittently.

Does your client have a limit on how many times a user can be logged in?

Re: Fake Alert

Post by ericssonvietnam »

I also suspect an issue of that kind, but I am not sure which parameters I need to check.

Re: Fake Alert

Post by scottwilkerson »

I've never modified this on a system, but here are some clues:

https://unix.stackexchange.com/question ... s-per-user
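
As a rough sketch of where per-user login and session limits usually live on a Linux host, assuming pam_limits and OpenSSH are in use on the monitored node: the helper names below are hypothetical, while `maxlogins`, `maxsyslogins`, `MaxSessions`, `MaxStartups` and `LoginGraceTime` are the standard setting names.

```shell
# Hypothetical helper: print uncommented maxlogins / maxsyslogins entries
# from the given pam_limits files, with file name and line number.
find_login_limits() {
    grep -HnE '^[^#]*(maxlogins|maxsyslogins)' "$@" 2>/dev/null
}

# Hypothetical helper: print the effective sshd settings that cap concurrent
# sessions or unauthenticated connections. sshd -T usually needs root.
show_sshd_limits() {
    sshd -T 2>/dev/null | grep -iE '^(maxsessions|maxstartups|logingracetime) '
}

# Example invocations on the monitored node:
#   find_login_limits /etc/security/limits.conf /etc/security/limits.d/*.conf
#   show_sshd_limits
```

No output from `find_login_limits` would suggest that no PAM login cap is configured in those files. `MaxStartups` limits concurrent unauthenticated connections, so it may be worth comparing with how many SSH checks Nagios starts against the node at the same moment.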

Re: Fake Alert

Post by ericssonvietnam »

Hi Team,

As checked, the node has no default limit on SSH sessions from the Nagios server.

5/17/2018 16:27 DNRBTB Hardware issue 1 CRITICAL HARD 1 1 OK OK CRITICAL - Plugin timed out after 60 seconds


Logs:
May 17 16:27:00 DNRBTB sshd[5335]: Connection from 10.10.164.52 port 43009
May 17 16:27:00 DNRBTB sshd[5335]: Found matching RSA key: 43:cc:08:f2:87:4e:89:09:03:e5:1c:f6:c2:38:0a:98
May 17 16:27:00 DNRBTB sshd[5336]: Postponed publickey for nagios from 10.10.164.52 port 43009 ssh2
May 17 16:27:00 DNRBTB sshd[5335]: Found matching RSA key: 43:cc:08:f2:87:4e:89:09:03:e5:1c:f6:c2:38:0a:98
May 17 16:27:00 DNRBTB sshd[5335]: Accepted publickey for nagios from 10.10.164.52 port 43009 ssh2
May 17 16:27:00 DNRBTB sshd[5335]: pam_unix(sshd:session): session opened for user nagios by (uid=0)
May 17 16:27:00 DNRBTB sshd[5337]: Received disconnect from 10.10.164.52: 11: disconnected by user
May 17 16:27:00 DNRBTB sshd[5335]: pam_unix(sshd:session): session closed for user nagios



Again, I observe the same thing in the logs, and I am still getting the same alert on a few of the node types.

Can you please help me find some other way to resolve this issue?
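
The SSH session in the log above opens and closes within the same second, while Nagios reports a timeout after 60 seconds, so the delay may be on the Nagios side of the call. The following is a minimal sketch for running the same check by hand from the Nagios XI server as the nagios user. The arguments are the ones quoted earlier in this thread, `-H` is the check_by_ssh host option, and the plugin path is the usual source-install location, which is an assumption. `run_remote_check` is a hypothetical helper name.

```shell
# Hypothetical helper: run the check_by_ssh call from this thread against one host.
# CHECK_BY_SSH may be set to override the assumed plugin location.
run_remote_check() {
    host=$1
    plugin=${CHECK_BY_SSH:-/usr/local/nagios/libexec/check_by_ssh}  # assumed path
    status=0
    "$plugin" -H "$host" -t 60 -o StrictHostKeyChecking=no -l nagios -E \
        -C "/home/nagios/bin/check_logfiles -f /home/nagios/logfile_basedir/conf/PSP_linux.conf" \
        || status=$?
    # Plugin convention: 0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN.
    echo "exit code: $status"
    return "$status"
}
```

Running `time run_remote_check DNRBTB` a number of times may show whether the call itself is slow or whether it only times out when Nagios schedules it alongside other checks.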

Re: Fake Alert

Post by scottwilkerson »

I have never seen this and cannot replicate it. The only thing I can suggest is to change the hosts causing the problems to use an agent to perform the checks instead of check_by_ssh.
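
As one possible shape for the agent-based approach, here is a rough sketch that assumes NRPE is the agent chosen. The command name `check_psp_logfiles` is made up for illustration, and the `check_nrpe` path is the usual source-install location, which is an assumption.

```shell
# On the monitored host, nrpe.cfg would need a matching entry, for example
# (hypothetical command name, plugin path taken from this thread):
#   command[check_psp_logfiles]=/home/nagios/bin/check_logfiles -f /home/nagios/logfile_basedir/conf/PSP_linux.conf

# Hypothetical helper: call that NRPE command from the Nagios server.
# CHECK_NRPE may be set to override the assumed plugin location.
run_nrpe_check() {
    host=$1
    plugin=${CHECK_NRPE:-/usr/local/nagios/libexec/check_nrpe}  # assumed path
    "$plugin" -H "$host" -t 60 -c check_psp_logfiles
}
```

With this layout the check no longer opens an SSH session per run, so any SSH connection or login limit on the node would stop affecting it.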