Issue with EC2 Nagios AMI
-
Fred Kroeger
- Posts: 588
- Joined: Wed Oct 19, 2011 11:36 pm
- Location: Perth, Western Australia
- Contact:
Issue with EC2 Nagios AMI
Hi
I am having exactly the same problem logging in via ssh as ec2-user to a NagiosXI instance that I've created in our VPC.
I keep getting "Permission denied (publickey,gssapi-keyex,gssapi-with-mic)." when I try to connect using my key pair.
I know that the key pair works because I have created a standard Linux instance and the credentials work fine for ec2-user there.
I can understand that the http interface is not working, as it is trying to update the NagiosXI app, but why should that affect the ability to ssh to the instance?
Our VPC won't allow internet access, so I am trying to get that fixed to allow the updates to proceed. However, I am stuck with not being able to ssh into the instance.
regards... Fred
OK, I can now download the updated nagios file after the installation, but it fails at the start when it runs it.
Still can't ssh to the instance.
Re: Issue with EC2 Nagios AMI
Is port 22 open on the server?
Code:
nmap <fqdn hostname of nagios server> -p 22
Former Nagios employee
"It is turtles. All. The. Way. Down. . . .and maybe an elephant or two."
VI VI VI - The editor of the Beast!
Come to the Dark Side.
-
Fred Kroeger
Re: Issue with EC2 Nagios AMI
Yes - I can get the login prompt - it just doesn't authenticate using the ec2-user credentials with my key pair.
I am launching the NagiosXI AMI and everything seems to work at start up. You can see from the log above that the updated nagios tar file has been downloaded and the upgrade is running; however, it seems to loop continuously at 0-repos. I suspect that the disk is now full, as it no longer responds to ssh requests.
Code:
# nmap 54.66.211.253 -p 22 # Public IP
Starting Nmap 5.51 ( http://nmap.org ) at 2014-11-18 08:11 WST
Nmap scan report for ec2-54-66-211-253.ap-southeast-2.compute.amazonaws.com (54.66.211.253)
Host is up (0.060s latency).
PORT STATE SERVICE
22/tcp open ssh
Nmap done: 1 IP address (1 host up) scanned in 0.49 seconds
Code:
# ssh -i ASG-Service-Enablement-Default-Key-Pair.pem [email protected]
Permission denied (publickey,gssapi-keyex,gssapi-with-mic).
Code:
$ ssh -i .ssh/ASG-Service-Enablement-Default-Key-Pair.pem [email protected] # Private IP - run from another linux server in my VPC
Permission denied (publickey,gssapi-keyex,gssapi-with-mic).
-
Fred Kroeger
Re: Issue with EC2 Nagios AMI
Researching this further, it appears that the particular error I am seeing, not being able "to retrieve metalink for repository: epel", is usually caused by the system clock being incorrect:
https://www.centos.org/forums/viewtopic.php?t=1420
Could this also be causing my ssh problems?
I am launching "Nagios XI CentOS x86_64 (ami-c78018fd)" in the ap-southeast-2a zone.
Further info....
I created a test Linux instance using the same key pair and can ssh to it as ec2-user, which proves the key pair we are using is correct.
I then detached the volume from the Nagios instance and mounted it under the test Linux instance. I checked the ec2-user home directory (~ec2-user) and there is no .ssh directory under it.
Is there something missing in the launch process?
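The check described above, looking for the key on the detached volume, can be sketched as a small shell function. This is only an illustration: the function name, the mount point and the assumption that the home directory lives at /home/<user> on that volume are all hypothetical.

```shell
# Succeed if USER has a non-empty authorized_keys file under ROOT,
# where ROOT is the mount point of the detached root volume.
# Assumes (hypothetically) that home directories live at ROOT/home/<user>.
has_authorized_keys() {
    root="$1"
    user="$2"
    [ -s "$root/home/$user/.ssh/authorized_keys" ]
}

# Hypothetical usage on the helper instance (mount point is an assumption):
#   has_authorized_keys /mnt/nagios-root ec2-user || echo "no key installed for ec2-user"
```

A missing or empty file here is consistent with the key never having been installed on the instance, rather than with a bad key pair.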
-
sreinhardt
- -fno-stack-protector
- Posts: 4366
- Joined: Mon Nov 19, 2012 12:10 pm
Re: Issue with EC2 Nagios AMI
You're not going to be able to deploy our AMI instance into a VPC without internet access. The install intentionally blocks user logon until after the XI install has been completed. Part of our expectation with cloud services is that they will have internet access at install time. Once the install is complete, your ssh keys are added to the instance and login is allowed.
I can only think of two options in this particular case. Either you will have to spin up the instance outside of your VPC and migrate it in once the install is complete, or you will have to start with a base CentOS system and do some form of restricted-internet install. The first option would be the clear winner for me; I'm not sure how offline installations would work in the cloud.
Nagios-Plugins maintainer exclusively, unless you have other C language bugs with open-source nagios projects, then I am happy to help! Please pm or use other communication to alert me to issues as I no longer track the forum.
-
Fred Kroeger
- Posts: 588
- Joined: Wed Oct 19, 2011 11:36 pm
- Location: Perth, Western Australia
- Contact:
Re: Issue with EC2 Nagios AMI
Thanks Spenser
I've been told that this instance currently does have internet access. Previously, when it didn't, it failed when trying to get the latest nagios tar file.
Now it appears to have downloaded the tar file and is running the install script, so I am assuming that it does have internet access.
It is now failing at 0-repos with the metalink error and continually looping at that point. Do you know how I can get past this?
Or alternatively, can the ssh keys be created first so that I can at least log in to the instance?
UPDATE
Looking at other suggestions for the metalink error on this forum, one is to change 'mirrorlist=https' to 'mirrorlist=http' in the epel repo file.
So.... I detached the volume from the Nagios instance and mounted it on another Linux instance. I made the change to the epel repo file, unmounted the volume from the test instance and attached it back to the Nagios instance. I restarted the Nagios instance and it now installs Nagios correctly, plus I can ssh to it using my keys.
Woohoo!
Moving forward - this is not a permanent solution, as something is preventing https access to the epel repos. Other solutions suggested that if the system clock is not correct then https won't work. Perhaps running ntpdate at the start of the install would solve this? Either that or you update your AMI with the above changes.
Regards... Fred
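On the clock theory raised above: certificate validation depends on the local time, so a clock that is far enough off can make https requests fail while plain http still works. A minimal sketch of a sanity check, assuming a trusted reference time is available from somewhere (for example an NTP query); the function name and the 300-second threshold are illustrative choices, not part of the Nagios installer.

```shell
# Succeed if the local clock differs from a reference clock by more
# than MAX_SKEW seconds. Both times are given as Unix epoch seconds.
clock_skew_exceeds() {
    local_epoch="$1"
    ref_epoch="$2"
    max_skew="$3"
    diff=$(( local_epoch - ref_epoch ))
    if [ "$diff" -lt 0 ]; then
        diff=$(( -diff ))   # absolute value
    fi
    [ "$diff" -gt "$max_skew" ]
}

# Hypothetical usage, with REF_EPOCH obtained from a trusted time source:
#   clock_skew_exceeds "$(date +%s)" "$REF_EPOCH" 300 && echo "clock looks wrong"
```

This only detects a bad clock; whether that is what actually breaks the epel https mirrorlist on this AMI is not established in the thread.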
Last edited by Fred Kroeger on Tue Nov 18, 2014 6:05 pm, edited 1 time in total.
-
sreinhardt
Re: Issue with EC2 Nagios AMI
We can't fix the ssh key issue without recreating the entire instance for everyone and letting it replicate worldwide, so that's not much of an option (at least for the moment). Are you still failing on the yum request after the CentOS continuous release? Does this server maybe have a proxy that yum needs to go through? This would almost definitely be the first time the script touches yum, so needing to redirect yum through a proxy would make sense to me. It's definitely something that needs to be worked on (from our end), but for now, if you can mount that drive on another instance and fix yum (if there is a proxy it needs), then reattach it to the correct instance and start it up, the install might go through. Wget and curl are the only other two things that access the internet, so if they are happy it's just up to yum.
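If a proxy does turn out to be needed, yum reads it from the proxy= option in the [main] section of yum.conf. A rough sketch of setting it on the mounted volume follows; the function name, the mount point and the proxy URL are hypothetical, and the function drops every existing proxy= line in the file it is given, so it is only suitable for a yum.conf whose sole section is [main].

```shell
# Set the yum proxy in the given yum.conf, keeping a backup as "<file>.bak".
# Any existing proxy= line is dropped and a new one is added after [main].
set_yum_proxy() {
    conf="$1"
    proxy_url="$2"
    cp "$conf" "$conf.bak"
    awk -v p="$proxy_url" '
        /^proxy=/   { next }              # drop any existing proxy line
                    { print }             # keep every other line as-is
        /^\[main\]/ { print "proxy=" p }  # add the new one right after [main]
    ' "$conf.bak" > "$conf"
}

# Hypothetical usage (mount point and proxy URL are assumptions):
#   set_yum_proxy /mnt/nagios-root/etc/yum.conf http://proxy.example.internal:3128
```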
-
Fred Kroeger
Re: Issue with EC2 Nagios AMI
Hi Spenser
The ssh keys are not really a problem as long as the Nagios install completes correctly. It's when it doesn't that you can't log in to fix anything.
At the moment, if I change the epel repo mirrorlist line to use http, everything works and Nagios installs correctly, so perhaps that is where the focus for fixing this should be.
Fred
-
sreinhardt
Re: Issue with EC2 Nagios AMI
We actually just found that out today on one of our internal test systems, and are looking into it. Not sure whether epel changed, our version of epel is now incorrect, or something else is happening there, but we are looking at it!
-
Fred Kroeger
Re: Issue with EC2 Nagios AMI
Hi Spenser
I edited /etc/yum.repos.d/epel.repo and changed every line that contained
mirrorlist=https://xxxx
to
mirrorlist=http://xxxx
Of course, because the ec2-user credentials weren't created, I could only do this by detaching the root volume and attaching it to another Linux server to edit the epel.repo file.
I then re-attached the root volume to the NagiosXI instance and rebooted. The install then completed successfully, as did the creation of the ec2-user credentials.
regards... Fred
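The edit described above can be sketched as a small shell function, for anyone who wants to script it. This is an illustration rather than an official fix: the function name, the device name and the mount point are hypothetical, and it only touches lines that start with mirrorlist=https://.

```shell
# Rewrite https mirrorlist URLs to http in a yum repo file.
# $1 is the path to the repo file; a backup is kept as "$1.bak".
epel_mirrorlist_to_http() {
    repo_file="$1"
    sed -i.bak 's|^mirrorlist=https://|mirrorlist=http://|' "$repo_file"
}

# Hypothetical usage on the helper instance
# (the device name and mount point are assumptions):
#   mount /dev/xvdf1 /mnt/nagios-root
#   epel_mirrorlist_to_http /mnt/nagios-root/etc/yum.repos.d/epel.repo
#   umount /mnt/nagios-root
```

Keeping the .bak copy makes it easy to put the https lines back once the underlying cause is fixed.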