Fake Alert
-
- Posts: 239
- Joined: Mon Jun 27, 2016 11:05 pm
Fake Alert
Hi Team,
I have checked and found that there was no issue on the server, but I still got an out-of-bound alert for the node below. Can you let me know what I can check on my end to resolve this?
I have also observed a similar issue for CPU usage.
Service: CPU Usage
Status: Critical
Duration: 3d 6h 4m 33s
Attempt: 5/5
Last check: 2018-05-11 17:05:38
Status information: CPU used 100.0% (>95) : CRITICAL
-
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
- Contact:
Re: Fake Alert
In order to diagnose the issue you are having we would need to know what plugin you are running and what arguments you are passing to it.
Can you attach the plugin along with the command that Nagios is running that returns these results?
If this is an NRPE call, we would need the plugin that is running on the remote system and the command.
-
- Posts: 239
- Joined: Mon Jun 27, 2016 11:05 pm
Re: Fake Alert
scottwilkerson wrote: In order to diagnose the issue you are having we would need to know what plugin you are running and what arguments you are passing to it.
Can you attach the plugin along with the command that Nagios is running that returns these results?
If this is an NRPE call, we would need the plugin that is running on the remote system and the command.
Hi, I am using the two arguments below to generate the alert, and the command is run through the check_by_ssh plugin.
-C "/home/nagios/bin/check_logfiles -f /home/nagios/logfile_basedir/conf/PSP_linux.conf"
-t 60 -o StrictHostKeyChecking=no -l nagios -E
Below are the logs from /var/log/secure. Let me know if you see anything unusual.
May 15 03:01:26 dnrbta sshd[5996]: Connection from 10.10.164.52 port 40186
May 15 03:01:26 dnrbta sshd[5996]: Found matching RSA key: 43:cc:08:f2:87:4e:89:09:03:e5:1c:f6:c2:38:0a:98
May 15 03:01:26 dnrbta sshd[5997]: Postponed publickey for nagios from 10.10.164.52 port 40186 ssh2
May 15 03:01:26 dnrbta sshd[5996]: Found matching RSA key: 43:cc:08:f2:87:4e:89:09:03:e5:1c:f6:c2:38:0a:98
May 15 03:01:26 dnrbta sshd[5996]: Accepted publickey for nagios from 10.10.164.52 port 40186 ssh2
May 15 03:01:26 dnrbta sshd[5996]: pam_unix(sshd:session): session opened for user nagios by (uid=0)
May 15 03:01:27 dnrbta sshd[5998]: Received disconnect from 10.10.164.52: 11: disconnected by user
May 15 03:01:27 dnrbta sshd[5996]: pam_unix(sshd:session): session closed for user nagios
May 15 03:01:36 dnrbta sshd[6031]: Connection from 10.10.164.52 port 40222
May 15 03:01:36 dnrbta sshd[6031]: Found matching RSA key: 43:cc:08:f2:87:4e:89:09:03:e5:1c:f6:c2:38:0a:98
May 15 03:01:36 dnrbta sshd[6032]: Postponed publickey for nagios from 10.10.164.52 port 40222 ssh2
May 15 03:01:36 dnrbta sshd[6031]: Found matching RSA key: 43:cc:08:f2:87:4e:89:09:03:e5:1c:f6:c2:38:0a:98
May 15 03:01:36 dnrbta sshd[6031]: Accepted publickey for nagios from 10.10.164.52 port 40222 ssh2
May 15 03:01:36 dnrbta sshd[6031]: pam_unix(sshd:session): session opened for user nagios by (uid=0)
May 15 03:01:36 dnrbta sshd[6033]: Received disconnect from 10.10.164.52: 11: disconnected by user
May 15 03:01:36 dnrbta sshd[6031]: pam_unix(sshd:session): session closed for user nagios
May 15 03:01:39 dnrbta sshd[6058]: Connection from 10.10.164.52 port 40243
May 15 03:01:39 dnrbta sshd[6059]: Connection closed by 10.10.164.52
May 15 03:01:48 dnrbta sshd[6060]: Connection from 10.10.164.52 port 40284
May 15 03:01:48 dnrbta sshd[6061]: Connection closed by 10.10.164.52
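For reference, the last two connections in this excerpt (ports 40243 and 40284) are closed by 10.10.164.52 without ever reaching "Accepted publickey". A minimal sketch for spotting such connections in a longer log is below. The function name unauthenticated_ports is made up for illustration. It assumes the sshd line format shown above and matches connections by source port, so a reused port in a long log could hide a failure.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nagios or OpenSSH): print the source port
# of every "Connection from" line that has no matching "Accepted publickey"
# line for the same port. Reads the given files, or stdin if none are given.
unauthenticated_ports() {
    awk '
        # Remember each incoming connection port in arrival order.
        /sshd\[[0-9]+\]: Connection from / { order[++n] = $NF; next }
        # The port is the second-to-last field here (the last one is "ssh2").
        /sshd\[[0-9]+\]: Accepted publickey / { accepted[$(NF-1)] = 1 }
        END {
            for (i = 1; i <= n; i++)
                if (!(order[i] in accepted)) print order[i]
        }
    ' "$@"
}
```

Usage, assuming you can read the log (this usually needs root): unauthenticated_ports /var/log/secure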
-
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
- Contact:
Re: Fake Alert
On the remote computer, if you su to nagios and run these commands, what do you get?
Code:
su nagios
/home/nagios/bin/check_logfiles -f /home/nagios/logfile_basedir/conf/PSP_linux.conf
echo $?
ls -al /home/nagios/logfile_basedir/conf/PSP_linux.conf
-
- Posts: 239
- Joined: Mon Jun 27, 2016 11:05 pm
Re: Fake Alert
scottwilkerson wrote: On the remote computer, if you su to nagios and run this command what do you get
Code:
su nagios
/home/nagios/bin/check_logfiles -f /home/nagios/logfile_basedir/conf/PSP_linux.conf
echo $?
ls -al /home/nagios/logfile_basedir/conf/PSP_linux.conf
Below is the output of the commands. As of now, I am facing this kind of issue with most of the log-based services for two of the nodes.
[nagios@dnrbta ~]$ /home/nagios/bin/check_logfiles -f /home/nagios/logfile_basedir/conf/PSP_linux.conf
OK - no errors or warnings|Power_Supply_lines=0 Power_Supply_warnings=0 Power_Supply_criticals=0 Power_Supply_unknowns=0
[nagios@dnrbta ~]$ echo $?
0
[nagios@dnrbta ~]$ ls -al /home/nagios/logfile_basedir/conf/PSP_linux.conf
-rw-r--r-- 1 nagios nagios 523 Jul 7 2016 /home/nagios/logfile_basedir/conf/PSP_linux.conf
-
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
- Contact:
Re: Fake Alert
This is weird because it seems to be working intermittently.
Does your client have a limit on how many times a user can be logged in?
-
- Posts: 239
- Joined: Mon Jun 27, 2016 11:05 pm
Re: Fake Alert
scottwilkerson wrote: This is weird because it seems to be working intermittently.
Does your client have a limit on how many times a user can be logged in?
I also suspect this kind of issue, but I am not sure which parameters I need to check.
-
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
- Contact:
Re: Fake Alert
ericssonvietnam wrote: I also suspect this kind of issue, but I am not sure which parameters I need to check.
I've never modified this on a system, but here are some clues:
https://unix.stackexchange.com/question ... s-per-user
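As a rough starting point (an assumption, not something confirmed in this thread), the settings that usually govern login and session counts are maxlogins and maxsyslogins in the pam_limits configuration, and MaxSessions and MaxStartups in sshd_config. The sketch below, with the made-up name show_session_limits, only prints uncommented lines that carry those settings from whatever files you pass it. It does not interpret them.

```shell
#!/bin/sh
# Hypothetical helper: list active (uncommented) login/session limit settings.
# Output format is "<file>: <matching line>".
show_session_limits() {
    awk '
        /^[ \t]*#/ { next }    # skip commented-out lines
        {
            line = tolower($0) # sshd_config keywords are case-insensitive
            if (line ~ /max(sys)?logins/ ||
                line ~ /^[ \t]*(maxsessions|maxstartups)[ \t]/)
                print FILENAME ": " $0
        }
    ' "$@"
}
```

Usage, with the typical default paths on most Linux systems (adjust if yours differ): show_session_limits /etc/security/limits.conf /etc/ssh/sshd_config. No output means none of these settings are set explicitly in those files, so the built-in defaults apply.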
-
- Posts: 239
- Joined: Mon Jun 27, 2016 11:05 pm
Re: Fake Alert
scottwilkerson wrote:
ericssonvietnam wrote: I also suspect this kind of issue, but I am not sure which parameters I need to check.
I've never modified this on a system, but here are some clues:
https://unix.stackexchange.com/question ... s-per-user
Hi Team,
I checked, and there is no default limit on the node for SSH sessions from the Nagios server.
5/17/2018 16:27 DNRBTB Hardware issue 1 CRITICAL HARD 1 1 OK OK CRITICAL - Plugin timed out after 60 seconds
Logs :
May 17 16:27:00 DNRBTB sshd[5335]: Connection from 10.10.164.52 port 43009
May 17 16:27:00 DNRBTB sshd[5335]: Found matching RSA key: 43:cc:08:f2:87:4e:89:09:03:e5:1c:f6:c2:38:0a:98
May 17 16:27:00 DNRBTB sshd[5336]: Postponed publickey for nagios from 10.10.164.52 port 43009 ssh2
May 17 16:27:00 DNRBTB sshd[5335]: Found matching RSA key: 43:cc:08:f2:87:4e:89:09:03:e5:1c:f6:c2:38:0a:98
May 17 16:27:00 DNRBTB sshd[5335]: Accepted publickey for nagios from 10.10.164.52 port 43009 ssh2
May 17 16:27:00 DNRBTB sshd[5335]: pam_unix(sshd:session): session opened for user nagios by (uid=0)
May 17 16:27:00 DNRBTB sshd[5337]: Received disconnect from 10.10.164.52: 11: disconnected by user
May 17 16:27:00 DNRBTB sshd[5335]: pam_unix(sshd:session): session closed for user nagios
Again, I observed the same thing in the logs, and I am still getting the same alert on a few of the node types.
Can you please help me find some other way to resolve this issue?
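Since the event above is a plugin timeout after 60 seconds, while the log excerpt shows a session that opens and closes within the same second, one way to narrow this down might be to time repeated runs from the Nagios server. A minimal sketch, with repeat_check as a made-up helper name, assuming a date command that supports +%s (GNU and BSD date do):

```shell
#!/bin/sh
# Hypothetical helper: run a command N times and print, for each run,
# the run number, the exit code and the elapsed wall-clock seconds.
repeat_check() {
    runs=$1
    shift
    i=1
    while [ "$i" -le "$runs" ]; do
        start=$(date +%s)
        "$@" >/dev/null 2>&1   # output is discarded; only timing and status matter
        rc=$?
        end=$(date +%s)
        echo "run=$i rc=$rc seconds=$((end - start))"
        i=$((i + 1))
    done
}
```

For example, as the nagios user on the Nagios server: repeat_check 20 ssh -l nagios -o StrictHostKeyChecking=no DNRBTB true. Running true on the remote side times only the SSH login, which separates connection delays from the plugin itself and avoids re-running check_logfiles by hand.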
-
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
- Contact:
Re: Fake Alert
I have never seen this and cannot replicate it. The only thing I can suggest would be to change the hosts causing the problems to use an agent to perform the checks instead of check_by_ssh.