problem with localhost/ nsswitch.conf concall request

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
benhank
Posts: 1264
Joined: Tue Apr 12, 2011 12:29 pm

Post by benhank »

We recently had an outage that caused the issue described here:

https://support.nagios.com/forum/viewtopic.php?f=16&t=49889&start=10
As stated in that post, a vendor (boo) made a change in DNS that caused any server or software configured to point to "localhost" to become unreachable, because localhost no longer resolved to 127.0.0.1.

The solution to the problem was to modify /etc/nsswitch.conf so that host lookups consult /etc/hosts first and then DNS:

vi /etc/nsswitch.conf    # change the hosts line to "hosts: files dns"
#
# /etc/nsswitch.conf
#
# An example Name Service Switch config file. This file should be
# sorted with the most-used services at the beginning.
#
# The entry '[NOTFOUND=return]' means that the search for an
# entry should stop if the search in the previous entry turned
# up nothing. Note that if the search failed due to some other reason
# (like no NIS server responding) then the search continues with the
# next entry.
#
# Valid entries include:
#
#       nisplus                 Use NIS+ (NIS version 3)
#       nis                     Use NIS (NIS version 2), also called YP
#       dns                     Use DNS (Domain Name Service)
#       files                   Use the local files
#       db                      Use the local database (.db) files
#       compat                  Use NIS on compat mode
#       hesiod                  Use Hesiod for user lookups
#       [NOTFOUND=return]       Stop searching if not found so far
#

# To use db, put the "db" in front of "files" for entries you want to be
# looked up first in the databases
#
# Example:
#passwd:    db files nisplus nis
#shadow:    db files nisplus nis
#group:     db files nisplus nis

passwd:     files
shadow:     files
group:      files

#hosts:     db files nisplus nis dns
hosts:  dns files

# Example - obey only what nisplus tells us...
#services:   nisplus [NOTFOUND=return] files
#networks:   nisplus [NOTFOUND=return] files
#protocols:  nisplus [NOTFOUND=return] files
#rpc:        nisplus [NOTFOUND=return] files
#ethers:     nisplus [NOTFOUND=return] files
#netmasks:   nisplus [NOTFOUND=return] files

bootparams: nisplus [NOTFOUND=return] files

ethers:     files
netmasks:   files
networks:   files
protocols:  files
rpc:        files
services:   files

netgroup:   nisplus

publickey:  nisplus

automount:  files nisplus
aliases:    files nisplus
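
For anyone auditing several boxes for this policy, the order of sources on the hosts line can be checked with a few lines of Python. This is only a minimal sketch: the helper names `hosts_sources` and `files_before_dns` are made up for illustration, and it assumes the file has a single uncommented `hosts:` line.

```python
import os


def hosts_sources(nsswitch_text):
    """Return the lookup sources on the active 'hosts:' line, in order."""
    for raw in nsswitch_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # ignore comments
        if line.startswith("hosts:"):
            return line[len("hosts:"):].split()
    return []


def files_before_dns(sources):
    """True when 'files' is consulted before 'dns' (or 'dns' is absent)."""
    if "files" not in sources:
        return False
    if "dns" not in sources:
        return True
    return sources.index("files") < sources.index("dns")


if __name__ == "__main__":
    path = "/etc/nsswitch.conf"
    if os.path.exists(path):
        with open(path) as fh:
            sources = hosts_sources(fh.read())
        print("hosts sources:", sources)
        print("files before dns:", files_before_dns(sources))
```

With `files` listed first, a name found in /etc/hosts (such as localhost) is answered locally and DNS is never asked for it.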
This has worked on every CentOS box we have except my XI production server.
When making this change on my XI prod server the following happens:

pg_pconnect(): Unable to connect to PostgreSQL server: could not connect to server: Connection timed out
        Is the server running on host "localhost" and accepting
        TCP/IP connections on port 5432? in /usr/local/nagiosxi/html/db/adodb/drivers/adodb-postgres64.inc.php on line 699
<h3>Databse Error</h3>A database connection error has been detected, please follow the repair prompt below. If the issue persists, please contact Nagios support.<p>Run the following from the CLI as root to attempt to repair the DB:<br><pre>/usr/local/nagiosxi/scripts/repair_databases.sh</pre></p>PHP Warning:  pg_pconnect(): Unable to connect to PostgreSQL server: could not connect to server: Connection timed out
        Is the server running on host "localhost" and accepting
        TCP/IP connections on port 5432? in /usr/local/nagiosxi/html/db/adodb/drivers/adodb-postgres64.inc.php on line 699
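
A connection timeout to "localhost" on port 5432 suggests the name may be resolving to something other than the loopback address. A rough way to see what the system resolver hands back is sketched below; the helper names `is_loopback` and `resolve` are illustrative, and the sketch assumes Python's `socket.getaddrinfo` goes through the same resolver configuration that PHP uses on the box.

```python
import ipaddress
import socket


def is_loopback(address):
    """True for loopback addresses such as 127.0.0.1 or ::1."""
    bare = address.split("%", 1)[0]  # drop any IPv6 scope suffix
    return ipaddress.ip_address(bare).is_loopback


def resolve(name):
    """Return the distinct addresses the system resolver gives for name."""
    infos = socket.getaddrinfo(name, None)
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    try:
        addresses = resolve("localhost")
    except socket.gaierror as exc:
        print("localhost did not resolve:", exc)
    else:
        for address in addresses:
            status = "loopback" if is_loopback(address) else "NOT loopback"
            print(address, status)
```

Any address reported as not loopback would explain why PostgreSQL on the local machine cannot be reached by that name.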
Although my prod server is currently running with the incorrect configuration, I have to be able to change the nsswitch.conf file as listed above, as part of a new policy to prevent this issue in the future.
The only difference on my prod server is that it has an offloaded MySQL database, which isn't affected when we make the change.
I am requesting a conference call with Nagios support so that we, along with my Linux admin, can figure out how to resolve this without taking down the prod server.
Thanks guys!
Proudly running:
NagiosXI 5.4.12 2 node Prod Env 2500 hosts, 13,000 services
Nagiosxi 5.5.7(test env) 2500 hosts, 13,000 services
Nagios Logserver 2 node Prod Env 500 objects sending
Nagios Network Analyser
Nagios Fusion
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Re: problem with localhost/ nsswitch.conf concall request

Post by ssax »

I know that there used to be a bug in the CentOS hostname configuration utility where it would remove the localhost entry from /etc/hosts when you ran the command. Does your XI server currently have localhost on 127.0.0.1?

[root@xig nagiosxi]# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
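
The same check can be scripted if you would rather not eyeball the file. This is a small sketch under the assumption that /etc/hosts uses the usual "address name [aliases...]" layout; `names_for` and `localhost_on_loopback` are hypothetical helper names.

```python
import os


def names_for(hosts_text, address):
    """Collect every name and alias listed for the given address."""
    names = []
    for raw in hosts_text.splitlines():
        fields = raw.split("#", 1)[0].split()  # ignore comments
        if fields and fields[0] == address:
            names.extend(fields[1:])
    return names


def localhost_on_loopback(hosts_text):
    """True when 'localhost' is listed on the 127.0.0.1 line."""
    return "localhost" in names_for(hosts_text, "127.0.0.1")


if __name__ == "__main__":
    path = "/etc/hosts"
    if os.path.exists(path):
        with open(path) as fh:
            print("localhost on 127.0.0.1:", localhost_on_loopback(fh.read()))
```

If this prints False, lookups with files listed first will not find localhost in /etc/hosts and will fall through to DNS.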
benhank
Posts: 1264
Joined: Tue Apr 12, 2011 12:29 pm

Re: problem with localhost/ nsswitch.conf concall request

Post by benhank »

hmmm...

No, mine doesn't look like that. I have the server's hostnames in there. I'll make the change on Monday and see what happens.
benhank
Posts: 1264
Joined: Tue Apr 12, 2011 12:29 pm

Re: problem with localhost/ nsswitch.conf concall request

Post by benhank »

Good news! I made the changes to the hosts file and it's all working properly! Thanks guys!
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Re: problem with localhost/ nsswitch.conf concall request

Post by ssax »

Glad to hear it. Are we okay to lock this and mark it as resolved?
benhank
Posts: 1264
Joined: Tue Apr 12, 2011 12:29 pm

Re: problem with localhost/ nsswitch.conf concall request

Post by benhank »

yep!