Issue with EC2 Nagios AMI
Posted: Sun Nov 16, 2014 10:21 pm
by Fred Kroeger
Hi
I am having exactly the same problems with trying to login via ssh as ec2-user to a NagiosXI instance that I've created in our VPC.
I keep getting "Permission denied (publickey,gssapi-keyex,gssapi-with-mic)." when I try to connect using my Key-pair.
I know that the key-pair works because I have created a standard Linux instance and the credentials work fine for the ec2-user.
I can understand that the http interface is not working as it is trying to update the NagiosXI app but why should that affect the ability to ssh to the instance?
Our VPC won't allow internet access, so I am trying to get that fixed to allow the updates to proceed. However, I am stuck with not being able to ssh into the instance.
regards... Fred
OK I can now download the updated nagios file after the installation but it fails at the start when it runs it
Still can't ssh to the instance
Re: Issue with EC2 Nagios AMI
Posted: Mon Nov 17, 2014 5:34 pm
by abrist
Is port 22 open on the server?
Code:
nmap <fqdn hostname of nagios server> -p 22
Re: Issue with EC2 Nagios AMI
Posted: Mon Nov 17, 2014 7:21 pm
by Fred Kroeger
Yes - I can get the login prompt - it just doesn't authenticate using the ec2-user credentials with my key pair
Code:
# nmap 54.66.211.253 -p 22 # Public IP
Starting Nmap 5.51 ( http://nmap.org ) at 2014-11-18 08:11 WST
Nmap scan report for ec2-54-66-211-253.ap-southeast-2.compute.amazonaws.com (54.66.211.253)
Host is up (0.060s latency).
PORT STATE SERVICE
22/tcp open ssh
Nmap done: 1 IP address (1 host up) scanned in 0.49 seconds
Code:
# ssh -i ASG-Service-Enablement-Default-Key-Pair.pem [email protected]
Permission denied (publickey,gssapi-keyex,gssapi-with-mic).
Code:
$ ssh -i .ssh/ASG-Service-Enablement-Default-Key-Pair.pem [email protected] # Private IP - run from another linux server in my VPC
Permission denied (publickey,gssapi-keyex,gssapi-with-mic).
I am launching the NagiosXI AMI and everything seems to work at start up. You can see the contents of the log above that the updated nagios tar file has been downloaded and it is running the upgrade - however it seems to loop continuously at 0-repos. I suspect that the disk is now full as it no longer responds to ssh requests.
Re: Issue with EC2 Nagios AMI
Posted: Mon Nov 17, 2014 8:22 pm
by Fred Kroeger
Researching this further - it appears that the particular error I am seeing, not being able "to retrieve metalink for repository: epel", is usually caused by the system clock not being correct
https://www.centos.org/forums/viewtopic.php?t=1420
Could this also be causing my ssh problems?
I am launching " Nagios XI CentOS x86_64 (ami-c78018fd) " in the ap-southeast-2a zone
Further info....
I created a test Linux instance using the same key-pair - I can ssh OK to that as ec2-user, so that proves the key-pair we are using is correct.
I then detached the volume from the Nagios instance and mounted it under the test Linux instance. I checked the ~ec2-user home directory and there is no .ssh directory under it.
Is there something missing in the launch process?
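For anyone repeating the check described above, here is a minimal sketch of it as a shell function. It assumes the detached root volume is already mounted somewhere on the test instance and that the user's home directory sits under home/ on that volume; the function name and the example mount point are made up for illustration.

```shell
#!/bin/sh
# Report whether a user's authorized_keys file exists (and is non-empty)
# on a root volume that has been mounted from another instance.
#   $1 = mount point of the attached volume (assumption, e.g. /mnt/nagios-root)
#   $2 = login name to check (defaults to ec2-user)
check_authorized_keys() {
    root="$1"
    user="${2:-ec2-user}"
    # Assumes the home directory lives under home/ on the mounted volume.
    keys="$root/home/$user/.ssh/authorized_keys"
    if [ -s "$keys" ]; then
        echo "present: $keys"
        return 0
    fi
    echo "missing: $keys"
    return 1
}
```

Called as `check_authorized_keys /mnt/nagios-root`, it prints which path it looked at and returns non-zero when the key file is absent, which matches the situation described in this post.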
Re: Issue with EC2 Nagios AMI
Posted: Tue Nov 18, 2014 2:03 pm
by sreinhardt
You're not going to be able to deploy our AMI instance into a VPC without internet access. The install intentionally blocks user logon until after the XI install has been completed. Part of our expectation with cloud services is that upon install they will have internet. Once the install is complete, your ssh keys are added to the instance and login is allowed.
I can only think of two options in this particular case. Either you will have to spin up the instance outside of your VPC and migrate it in once the install is complete, or you will have to start with a base cent system and do some form of restricted internet install. I think the first option would be the clear winner for me, I'm not so sure how offline installations would work in the cloud.
Re: Issue with EC2 Nagios AMI
Posted: Tue Nov 18, 2014 4:34 pm
by Fred Kroeger
Thanks Spenser
I've been told that currently this instance does have internet access. Previously when it didn't, it failed when it was trying to get the latest nagios tar file.
Now it appears to have downloaded the tar file and is running the install script, so I am assuming that it does have internet access.
It is now failing at 0-repos with the metalink error and continually looping at that point. Do you know how I can get past this?
Or alternatively can the creation of the ssh keys be done first so that I can at least login to the instance?
UPDATE
Looking at other suggestions for the metalink error on this forum, one suggestion is to change 'mirrorlist=https' to 'mirrorlist=http' in the epel repo file.
So.... I have detached the volume from the Nagios instance and mounted it on another linux instance. I made the change to the epel repo file, unmounted it from the test instance and attached it back again to the Nagios instance. Restarted the Nagios instance and it now installs Nagios correctly, plus I can ssh to it using my keys.
Woohoo!
Moving forward - this is not a permanent solution, as there is something that is preventing https access to the epel repos. Other solutions suggested that if the system clock is not correct then https won't work. Perhaps running ntpdate at the start of the install would solve this? Either that or you update your AMI with the above changes.
Regards... Fred
Re: Issue with EC2 Nagios AMI
Posted: Tue Nov 18, 2014 5:57 pm
by sreinhardt
We can't fix the ssh key issues without recreating the entire instance for everyone and letting it replicate worldwide, so not too much of an option (at least for the moment). Are you still failing for the yum request post cent continuous release? Does this server maybe have a proxy that yum needs to go through? This would almost definitely be the first time the script touches yum, so if we need to redirect yum through a proxy it would seem to make sense to me. It's definitely something that needs to be worked on (from our end), but for now, if you can mount that drive on another instance and fix yum (if there is a proxy it needs), then reattach it to the correct instance and start up, the install might go through. Wget and curl are the only other two things that would access internet, so if they are happy it's just up to yum.
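If a proxy does turn out to be needed, a rough sketch of setting it on the mounted volume could look like the following. It relies on yum's `proxy=` option in yum.conf and assumes the file only contains a [main] section, so that an appended line lands there; the function name and any proxy URL passed in are hypothetical.

```shell
#!/bin/sh
# Add or replace the proxy= line in a yum.conf file.
#   $1 = path to yum.conf (on a mounted volume, something like
#        <mount point>/etc/yum.conf; the mount point is an assumption)
#   $2 = proxy URL (hypothetical, supply your own)
# Assumes the file has only a [main] section, so appending is safe.
set_yum_proxy() {
    conf="$1"
    proxy_url="$2"
    # Copy every line except an existing proxy= setting; grep exits
    # non-zero when nothing is left to print, which is fine here.
    grep -v '^proxy=' "$conf" > "$conf.tmp" || true
    # Append the new setting and swap the file into place.
    echo "proxy=$proxy_url" >> "$conf.tmp"
    mv "$conf.tmp" "$conf"
}
```

Something like `set_yum_proxy /mnt/nagios-root/etc/yum.conf http://proxy.example:3128` would then be run before unmounting the volume and attaching it back to the original instance.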
Re: Issue with EC2 Nagios AMI
Posted: Tue Nov 18, 2014 6:28 pm
by Fred Kroeger
Hi Spenser
The ssh keys are not really a problem as long as the Nagios install completes correctly. It's when it doesn't that you can't log in to fix anything.
At the moment if I change the epel repo mirrorlist line to use http - everything works and Nagios installs correctly - so perhaps that is where the focus for fixing this should be.
Fred
Re: Issue with EC2 Nagios AMI
Posted: Wed Nov 19, 2014 4:56 pm
by sreinhardt
We actually just found that out today on one of our internal test systems, and are looking into it. Not sure if epel changed, our version of epel is now incorrect, or what might be happening there, but we are looking at it!
Re: Issue with EC2 Nagios AMI
Posted: Wed Nov 19, 2014 10:12 pm
by Fred Kroeger
Hi Spenser
I edited /etc/yum.repos.d/epel.repo and changed every line that contained mirrorlist=https://xxxx to mirrorlist=http://xxx
Of course, because the ec2-user credentials weren't created, I could only do this by detaching the root volume and attaching it to another linux server to edit the epel.repo file.
I then re-attached the root volume to the NagiosXI instance and rebooted - the install then completed successfully, as did the creation of the ec2-user credentials.
regards... Fred
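The edit described above can be scripted rather than done by hand. This is only a sketch: the function name is made up, and the path passed in is assumed to be the epel.repo file on the root volume while it is mounted on the other server. It keeps a copy of the original next to it; yum only reads files ending in .repo, so the .bak copy should be ignored.

```shell
#!/bin/sh
# Switch mirrorlist=https:// to mirrorlist=http:// in an EPEL repo file.
#   $1 = path to epel.repo, e.g. <mount point>/etc/yum.repos.d/epel.repo
#        (the mount point is an assumption and depends on your setup)
epel_mirrorlist_to_http() {
    repo="$1"
    # Keep the original alongside as epel.repo.bak.
    cp -p "$repo" "$repo.bak"
    # Rewrite every line containing mirrorlist=https:// and leave the
    # rest of the file untouched.
    sed 's|mirrorlist=https://|mirrorlist=http://|' "$repo.bak" > "$repo"
}
```

After running it against the mounted file, unmount the volume, re-attach it to the Nagios instance and reboot, as in the steps above.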