Issue setting up DNX
Re: Issue setting up DNX
It sounds like you are moving in the right direction.
Re: Issue setting up DNX
Well, I was sure we would have this resolved by now. Here is where we are:
netstat -aupn shows established UDP connections on all 3 clients:
[root@localhost ~]# netstat -aupn
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
udp 0 0 192.168.252.114:55499 192.168.252.113:12481 ESTABLISHED 1565/dnxClient
udp 0 0 192.168.252.114:45390 192.168.252.113:12481 ESTABLISHED 1565/dnxClient
udp 0 0 192.168.252.114:58322 192.168.252.113:12481 ESTABLISHED 1565/dnxClient
etc....
However, on the server (192.168.252.113) we do not show any UDP connectivity to the clients. The Foreign Address is always 0.0.0.0:*.
[root@nagiosdeva ~]# netstat -aupn
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
udp 0 0 0.0.0.0:5353 0.0.0.0:* 1296/avahi-daemon
udp 0 0 192.168.252.113:123 0.0.0.0:* 1390/ntpd
udp 0 0 127.0.0.1:123 0.0.0.0:* 1390/ntpd
udp 0 0 0.0.0.0:123 0.0.0.0:* 1390/ntpd
udp 0 0 0.0.0.0:35200 0.0.0.0:* 1296/avahi-daemon
udp 0 0 0.0.0.0:162 0.0.0.0:* 1353/snmptrapd
udp 0 0 192.168.252.113:12480 0.0.0.0:* 1751/dnxServer
udp 0 0 192.168.252.113:12481 0.0.0.0:* 1751/dnxServer
udp 0 0 192.168.252.113:12482 0.0.0.0:* 1751/dnxServer
The server logs the following error repeatedly, once for each check:
[Fri May 25 16:26:33.69 2012] Allocating node request.
[Fri May 25 16:26:33.72 2012] Post failed: Resource was not found. Service check [3974] will execute locally: /usr/local/nagios/libexec/check_icmp -H 148.80.26.37 -w 3000.0,80% -c 5000.0,100% -p 5.
[Fri May 25 16:26:36.153 2012] Reaper handler called.
[Fri May 25 16:26:39.119 2012] Reaper handler called.
It's my belief that there is some type of network connectivity problem on the server.
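A note on reading the output above: an unconnected UDP socket always shows 0.0.0.0:* as its Foreign Address, whether or not packets are arriving, so the server listing on its own does not prove that traffic is being lost. The clients show ESTABLISHED only because a UDP socket that has called connect() records its peer. What the listing does confirm is which ports dnxServer has bound. The following is a minimal sketch for pulling that out of netstat; the function name udp_sockets_for is made up for illustration, and it assumes program names without spaces.

```shell
#!/bin/sh
# List the local UDP sockets owned by a given program, reading
# `netstat -aupn` output on stdin. The program name is taken from the
# last field ("PID/Program name"); the local address is always the
# fourth field, whether or not the State column is filled in.
# udp_sockets_for is a hypothetical helper name, not part of DNX.
udp_sockets_for() {
    prog="$1"
    awk -v prog="$prog" '
        $1 == "udp" {
            n = split($NF, parts, "/")
            if (n == 2 && parts[2] == prog) print $4
        }'
}

# Usage (as root, so netstat can show process names):
#   netstat -aupn | udp_sockets_for dnxServer
```

To see whether client packets actually reach the server, a packet capture is more telling than netstat, for example running tcpdump -n udp portrange 12480-12482 as root on the server.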
Re: Issue setting up DNX
Can you check whether SELinux is disabled? If it is not, go ahead and disable it:
setenforce 0
If that fixes the issue, update the /etc/selinux/config file to make the change permanent.
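Keep in mind that setenforce 0 only switches the running system to permissive mode until the next reboot, and getenforce reports that runtime mode. The boot-time setting lives in the SELINUX= line of /etc/selinux/config. As a rough sketch, the configured value can be read like this; the function name selinux_configured_mode is hypothetical, and the default path assumes a Red Hat style layout.

```shell
#!/bin/sh
# Print the mode configured in an SELinux config file (enforcing,
# permissive, or disabled). The path defaults to /etc/selinux/config.
# selinux_configured_mode is a hypothetical helper name.
selinux_configured_mode() {
    conf="${1:-/etc/selinux/config}"
    # Match only uncommented SELINUX= lines (SELINUXTYPE= does not match)
    # and keep the last one if there are several.
    sed -n 's/^[[:space:]]*SELINUX=[[:space:]]*\([A-Za-z]*\).*/\1/p' "$conf" | tail -n 1
}

# Usage:
#   getenforce                  # mode in effect right now
#   selinux_configured_mode     # mode that will apply after a reboot
```

If the two disagree, the runtime change has not been made permanent yet.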
Re: Issue setting up DNX
SELinux was already disabled.
Re: Issue setting up DNX
Progress: We believe the primary problem was that iptables on the server was blocking the UDP traffic. Stopping iptables resolved our problems. I am not exactly sure what was going on there, but I will not argue with success. Following are a few lines from the audit file that confirm our activity.
[root@nagiosdeva log]# tail -f dnxsrv.audit.log
[Tue May 29 12:34:24.651 2012] ASSIGN: Job 461: Worker 192.168.252.114-72fca8c0: /usr/local/nagios/libexec/check_cpu_remote -e ssh -H 148.80.26.5 -w 8 -c 14 -t 30
[Tue May 29 12:34:24.686 2012] ASSIGN: Job 462: Worker 192.168.252.114-72fca8c0: /usr/local/nagios/libexec/check_cpu_remote -e ssh -H 148.80.26.31 -w 8 -c 14 -t 30
[Tue May 29 12:34:24.704 2012] ASSIGN: Job 463: Worker 192.168.252.114-72fca8c0: /usr/local/nagios/libexec/check_disk_remote -e ssh -H 148.80.26.36 -w 92 -c 97
[Tue May 29 12:34:24.719 2012] DISPATCH: Job 461: Worker 192.168.252.114-72fca8c0: /usr/local/nagios/libexec/check_cpu_remote -e ssh -H 148.80.26.5 -w 8 -c 14 -t 30
[Tue May 29 12:34:24.756 2012] DISPATCH: Job 462: Worker 192.168.252.114-72fca8c0: /usr/local/nagios/libexec/check_cpu_remote -e ssh -H 148.80.26.31 -w 8 -c 14 -t 30
[Tue May 29 12:34:24.804 2012] DISPATCH: Job 463: Worker 192.168.252.114-72fca8c0: /usr/local/nagios/libexec/check_disk_remote -e ssh -H 148.80.26.36 -w 92 -c 97
[Tue May 29 12:34:25.992 2012] COLLECT: Job 460: Worker 192.168.252.114-72fca8c0: /usr/local/nagios/libexec/check_disk_remote -e ssh -H 148.80.26.43 -w 92 -c 97
[Tue May 29 12:34:26.188 2012] COLLECT: Job 463: Worker 192.168.252.114-72fca8c0: /usr/local/nagios/libexec/check_disk_remote -e ssh -H 148.80.26.36 -w 92 -c 97
[Tue May 29 12:34:27.540 2012] COLLECT: Job 461: Worker 192.168.252.114-72fca8c0: /usr/local/nagios/libexec/check_cpu_remote -e ssh -H 148.80.26.5 -w 8 -c 14 -t 30
[Tue May 29 12:34:28.516 2012] COLLECT: Job 462: Worker 192.168.252.114-72fca8c0: /usr/local/nagios/libexec/check_cpu_remote -e ssh -H 148.80.26.31 -w 8 -c 14 -t 30
I have another minor problem but I will start a new thread for it. Thanks for everyone's assistance.
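For anyone who would rather keep iptables running than stop it, one option is to open only the UDP ports that dnxServer binds (12480 to 12482 in the netstat output earlier in this thread). The sketch below only prints the commands so they can be reviewed first. The function name dnx_iptables_rules and the 192.168.252.0/24 source network are assumptions for illustration, not something confirmed in the thread.

```shell
#!/bin/sh
# Print (do not run) iptables commands that accept inbound UDP on the
# given ports from one source network. Review the output, then run the
# printed commands as root if they match your policy.
# dnx_iptables_rules is a hypothetical helper name.
dnx_iptables_rules() {
    src="$1"; shift
    for port in "$@"; do
        # -I inserts at the top of INPUT so the rule is evaluated
        # before any existing REJECT or DROP rule.
        echo "iptables -I INPUT -p udp -s $src --dport $port -j ACCEPT"
    done
}

# Usage (source network is an assumed example):
#   dnx_iptables_rules 192.168.252.0/24 12480 12481 12482
```

Rules added this way do not survive a reboot unless they are saved with whatever mechanism the distribution provides.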