Permissions issues after migration
Posted: Mon Nov 27, 2023 10:44 pm
Hey All,
I hope someone can help me, because I'm really battling with my production Nagios instance.
Some background:
We have a main enterprise XI and a couple of satellite XIs that feed into it. The satellite checks are all OK, and all checks that were created on NCPA are also OK.
What happened was that we migrated from CentOS 7 to Ubuntu Server 20.04 by following the Nagios migration guides. All of the steps were completed, including the repairs and the reinstallation of a few plugins that have architecture differences between CentOS and Ubuntu. I expected there to be problems with old checks using NRPE/NSClient++, so I created duplicate checks for all services in NCPA before migrating.
Now for the actual issue:
There are a number of permissions issues and weird behaviours on the new Ubuntu server after the migration. The old and new servers are both on 5.8.6, as the versions needed to be the same to migrate.
I believe everything is related to permissions and most likely has the same cause, but I'm not sure how to fix the permissions or what they should be, and because it's prod I don't want to fiddle too much.
Component status shows everything except Monitoring Engine and Performance Grapher as down, yet the monitoring engine is working perfectly and I get system graphs too. I have noticed that some configuration wizards complete as successful but don't write anything into the CCM or to the OS. Similarly, I can't change anything in CCM, because when applying configs it just runs the ..... forever and doesn't complete. The same thing happens when I run a forced check on a service: the command times out.
What contradicts all of that is we have some custom quick commands that call bash scripts in /usr/local/nagios/libexec and those are working from the web frontend with quick actions!
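To make the permissions easier to compare between the old CentOS box and the new Ubuntu one, here is a minimal sketch of how the ownership and modes under a directory could be dumped. This is only an illustration, not something from the migration guide: `describe` and `audit` are hypothetical helper names I made up, and the only path taken from my setup is /usr/local/nagios/libexec.

```python
import grp
import os
import pwd
import stat


def describe(path):
    """Return one line with mode, owner:group and path (hypothetical helper)."""
    st = os.lstat(path)  # lstat so symlinks are reported, not followed
    try:
        owner = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:
        owner = str(st.st_uid)  # uid with no matching account on this host
    try:
        group = grp.getgrgid(st.st_gid).gr_name
    except KeyError:
        group = str(st.st_gid)  # gid with no matching group on this host
    return f"{stat.filemode(st.st_mode)} {owner}:{group} {path}"


def audit(root):
    """List root itself and everything below it, one line per entry."""
    lines = [describe(root)]
    for dirpath, dirnames, filenames in os.walk(root):
        for name in sorted(dirnames + filenames):
            lines.append(describe(os.path.join(dirpath, name)))
    return lines
```

The idea would be to run something like `print("\n".join(audit("/usr/local/nagios/libexec")))` on both servers and diff the two outputs. A numeric owner or group in the output would mean the uid/gid has no matching account on that host, which is the kind of mismatch I suspect after moving between distros.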
Can anyone think of something I can try? I'm pulling my hair out here. The two main things I need to get working ASAP are:
1. Write configs from CCM and Wizards
2. Fix the System Components that are showing red.
Any assistance is really appreciated!