
How to troubleshoot webinject?

Posted: Tue May 28, 2013 3:48 pm
by vivithemage
I am trying to work out why a bunch of our monitored sites are failing webinject test case 1, and the thing I see in nagios.log is:

Code: Select all

ndomod: Still unable to connect to data sink. 148730520 items lost, 5000 queued items to flush.
Is this an issue? Or something that is probably unrelated?

Either way, where can I dig into what is failing, and possibly determine why the tests are failing?

Re: How to troubleshoot webinject?

Posted: Tue May 28, 2013 3:58 pm
by lmiltchev
Try running the following commands in this particular order:

Code: Select all

service nagios stop
killall -9 nagios
service ndo2db stop
service ndo2db start
service nagios start
Hope this helps.

Re: How to troubleshoot webinject?

Posted: Tue May 28, 2013 4:13 pm
by vivithemage
ndo2db did not exist, but mysqld did, so I stopped nagios, ran kill -9 on the remaining nagios processes, then started mysqld and then nagios.

I still get:

Code: Select all

[1369775572] ndomod: Still unable to connect to data sink.  12779 items lost, 5000 queued items to flush.
Is it possible this is the reason some sites are reporting failed webinject checks?

Re: How to troubleshoot webinject?

Posted: Tue May 28, 2013 4:26 pm
by sreinhardt
ndo2db would not be an issue with the checks themselves; however, it is definitely an issue. Just to verify, are you certain you are using Nagios Core and not Nagios XI? It seems odd that you would have mysql and the ndomod without the ndo service. Did you follow an installation guide or something else that you can link us to, so we can better understand your setup?
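
One way to confirm that the data sink problem is separate from the check failures is to watch whether the lost-item counter keeps growing between restarts. A minimal sketch, assuming the log lines look like the two quoted earlier in this thread; the function name `ndomod_lost` is made up for illustration.

```shell
#!/bin/sh
# Read log lines on stdin and print "<items lost> <queued items>" for every
# ndomod data sink warning. Lines that do not match are ignored.
ndomod_lost() {
    sed -n 's/.*ndomod: Still unable to connect to data sink\.[[:space:]]*\([0-9][0-9]*\) items lost, \([0-9][0-9]*\) queued items to flush\..*/\1 \2/p'
}
```

It could be fed with something like `tail -n 200 /usr/local/nagios/var/nagios.log | ndomod_lost`, where the log path is an assumption based on a default source install and may differ on this box.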

Re: How to troubleshoot webinject?

Posted: Tue May 28, 2013 4:27 pm
by abrist
What broker are you using?

Code: Select all

grep broker /usr/local/nagios/etc/nagios.cfg
Are you sure your database credentials are correct?
Also, try restarting ndo2db after you restart nagios.

Code: Select all

service nagios restart
service ndo2db restart
What version of nagios are you running?
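
Because `grep broker` also returns commented-out examples, it can help to list only the active directives and see which config file the module is given. A minimal sketch; the function names are made up for illustration, and the path passed in is whichever nagios.cfg applies on the system.

```shell
#!/bin/sh
# List only the uncommented broker_module lines from a Nagios main config file.
active_broker_modules() {
    grep -E '^[[:space:]]*broker_module=' "$1"
}

# Print the config_file= argument given to each active broker module, if any.
broker_config_files() {
    active_broker_modules "$1" | sed -n 's/.*config_file=\([^[:space:]]*\).*/\1/p'
}
```

The file this prints is typically where the module's output target is set, so it is a reasonable next place to look when the module cannot reach its data sink.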

Re: How to troubleshoot webinject?

Posted: Tue May 28, 2013 4:29 pm
by vivithemage
[root@vmmgtnagios rnj]# grep broker /usr/local/nagios/etc/nagios.cfg
# Controls what (if any) data gets sent to the event broker.
event_broker_options=-1
# This directive is used to specify an event broker module that should
# broker_module=<modulepath> [moduleargs]
#broker_module=/somewhere/module1.o
#broker_module=/somewhere/module2.o arg1 arg2=3 debug=0
# 64 = Event broker
broker_module=/usr/local/nagios/bin/ndomod-3x.o config_file=/usr/local/nagios/etc/ndomod.cfg
[root@vmmgtnagios rnj]#



The ndo2db service does not exist, which is why I chose to restart mysqld instead, as it looks like the only db server on this box. I never set this box up; I am just trying to fix and troubleshoot oddities on it now. I don't know if the credentials are correct ...

Nagios® Core™ 3.4.1

Re: How to troubleshoot webinject?

Posted: Tue May 28, 2013 4:44 pm
by sreinhardt
How about trying these? If one set of three works, do not run the other.

Code: Select all

service ndoutils stop
killall -9 ndo2db
service ndoutils start

/etc/init.d/ndoutils stop
killall -9 ndo2db
/etc/init.d/ndoutils start
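
The two sets differ only in how the init script is invoked. A small sketch that reports which script is actually present, assuming a SysV-style /etc/init.d layout; the candidate names ndo2db and ndoutils come from this thread, while the function name and the optional directory argument are illustrative.

```shell
#!/bin/sh
# Print the first executable init script found among the candidate names.
# Usage: find_ndo_init [directory]   (defaults to /etc/init.d)
find_ndo_init() {
    dir=${1:-/etc/init.d}
    for name in ndo2db ndoutils; do
        if [ -x "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
            return 0
        fi
    done
    # Neither candidate exists in this directory.
    return 1
}
```

If it returns nothing, the daemon may be started some other way on this box, or may not be installed at all.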

Re: How to troubleshoot webinject?

Posted: Wed May 29, 2013 9:10 am
by vivithemage
It took some digging, but I figured out this issue and fixed the sink errors. I am still seeing false alerts, though. Is there anything I can do to debug why they are falsely alerting, and what is failing a check?

Also, is it possible to check whether I/O, CPU, or memory is an issue?

Re: How to troubleshoot webinject?

Posted: Wed May 29, 2013 3:56 pm
by sreinhardt
We can certainly look at load issues. Try the commands below:

Code: Select all

ps ax | wc -l

ulimit -a

cat /proc/loadavg

cat /proc/sys/kernel/threads-max

grep -i rlimit /usr/local/apache/conf/httpd.conf

uptime

free -m

iostat -x
Could you post a few of the webinject queries that are being used? You can obfuscate any exact URIs, usernames, and passwords if they are listed. It would also be helpful to have an example of the output from running these same checks.
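
To capture that output, it usually helps to run the check command by hand and look at both what it prints and its exit status. A minimal sketch of a wrapper, assuming the standard Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN); the function names are illustrative, and the command passed in is whatever the webinject command definition expands to on your system.

```shell
#!/bin/sh
# Map a plugin exit status to its Nagios state name.
# Anything other than 0, 1 or 2 is reported here as UNKNOWN.
nagios_state() {
    case "$1" in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        *) echo UNKNOWN ;;
    esac
}

# Run a check command, show its output, then report exit status and state.
# Usage: run_check <command> [args...]
run_check() {
    "$@"
    status=$?
    printf 'exit=%s state=%s\n' "$status" "$(nagios_state "$status")"
    return "$status"
}
```

Running it as the same user Nagios runs under matters, since permissions and environment can differ from root's.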

Re: How to troubleshoot webinject?

Posted: Thu May 30, 2013 1:55 pm
by vivithemage
[root@vmmgtnagios ~]# ps ax | wc -l
197
[root@vmmgtnagios ~]# ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 73728
max locked memory (kbytes, -l) 32
max memory size (kbytes, -m) unlimited
open files (-n) 8192
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 73728
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
[root@vmmgtnagios ~]# cat /proc/loadavg
0.67 0.35 0.22 1/310 21381
[root@vmmgtnagios ~]# cat /proc/sys/kernel/threads-max
147456
[root@vmmgtnagios ~]# uptime
13:46:32 up 1 day, 20:50, 1 user, load average: 0.52, 0.33, 0.22
[root@vmmgtnagios ~]# free -m
             total       used       free     shared    buffers     cached
Mem:          7983       4288       3695          0        306       3415
-/+ buffers/cache:        566       7417
Swap:         2047          0       2047
[root@vmmgtnagios ~]# grep -i rlimit /usr/local/apache/conf/httpd.conf
grep: /usr/local/apache/conf/httpd.conf: No such file or directory
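
As a rough way to read the figures above: load average is usually judged relative to the number of CPU cores, which the output does not show. A minimal sketch, with the core count passed in as an assumption (on Linux it can usually be taken from `nproc`); the function name is illustrative.

```shell
#!/bin/sh
# Print the load average divided by the number of CPU cores, to two decimals.
# Usage: load_per_cpu <load> <cores>
load_per_cpu() {
    awk -v load="$1" -v cpus="$2" 'BEGIN {
        if (cpus <= 0) { exit 1 }   # refuse a zero or negative core count
        printf "%.2f\n", load / cpus
    }'
}

# Example (not run here): 1-minute load against the local core count.
# load_per_cpu "$(cut -d" " -f1 /proc/loadavg)" "$(nproc)"
```

A result well below 1.00 per core, together with no swap in use, would suggest the box itself is not under pressure.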


This is my services.cfg located in: /u01/app/nagios/etc/objects

Code: Select all

define service {
  name linux_memory
  check_command check_snmp_mem_v1!-N!95,60!99,90
}
define command {
  name check_disk_all
  service_description Verification / /usr /var /u01
  check_command check_snmp_storage_v1!"^/$|usr|u01|var"!85!95!
}
define service {
   name linux_load
   check_command check_snmp_load_v1!netsl!4,3,3!8,5,5
}
Here's a snippet of what we call a pod, from p001-p029.cfg, located in: /u01/app/nagios/etc/rnj

Code: Select all

## HOST AND JVM CHECKS FOR p001

define service{
    use                    generic-service
    host_name              p001_node02,p001_node01
    service_description    PING
    check_command          check_ping!3000.0,80%!5000.0,100%
    normal_check_interval  10
    retry_check_interval   1
}

define service{
    use                    generic-service
    host_name              p001_node02,p001_node01
    service_description    Check JVM
    check_command          check_tomcat!/RNJ!9080
    normal_check_interval  3
    retry_check_interval   1
}
define servicegroup{
        servicegroup_name       p001
        alias                   p001 GRC pod
        members                 p001_node01,Check JVM,p001_node02,Check JVM,p001_node01,client1000,p001_node02,client1000,p001_node01,client1001,p001_node02,client1001,p001_node01,client1002,p001_node02,client1002,p001_node01,client1003,p001_node02,client1003,p001_node01,client1004,p001_node02,client1004,p001_node01,client1005,p001_node02,client1005,p001_node01,client1006,p001_node02,client1006,p001_node01,client1007,p001_node02,client1007,p001_node01,client1008,p001_node02,client1008,p001_node01,client1009,p001_node02,client1009
        }
Is there something else you were looking for specifically related to webinject?
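
None of the definitions posted above mention webinject, so the relevant command and service definitions are probably in another file. A minimal sketch for locating them, assuming the object files live under a single directory tree such as the /u01/app/nagios/etc path mentioned above; the function name is illustrative.

```shell
#!/bin/sh
# Recursively list lines that mention webinject (case-insensitive) under a
# directory, prefixed with file name and line number.
# Usage: find_webinject_defs <config directory>
find_webinject_defs() {
    grep -rni 'webinject' "$1"
}
```

For example, `find_webinject_defs /u01/app/nagios/etc` would show which command definitions call the plugin and which services use them.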