XI System Component Status - Issues
Re: XI System Component Status - Issues
The nagios.log file looks good. Every time Apply Configuration is run, it restarts the Nagios process, which is what the log you posted appears to show. Do you have a lot of users making changes on the system?
Be sure to check out our Knowledgebase for helpful articles and solutions!
Re: XI System Component Status - Issues
A co-worker and I are the only ones making changes to the system right now. My co-worker is integrating our inventory system with Nagios XI so that new hosts and hostgroups are added, updated, or removed on a nightly basis. I do not know how often he is making changes or using Apply Configuration.
If you think the flip-flopping of the monitoring engine is due to Apply Configuration, I can set up a test: I will confirm that my co-worker stops work for an hour or so, and I will not add or adjust services during that time.
Re: XI System Component Status - Issues
How is he adding these changes to Nagios XI? It sounds automated, and if he's doing any testing this could cause the service to restart.
To eliminate this as a possibility, can you please clarify with him and ask him how he's adding them? I would like to rule this out as an option.
Former Nagios Employee
Re: XI System Component Status - Issues
I spoke with him: he is making changes directly to the database. So far he has added only one host and host group to confirm the function works, and he is simulating the response back from Nagios. He said that function is commented out while he works on the script. He has run Apply Configuration only once, about a week or so ago, when he was working on that function and wanted to make sure it worked.
So, it appears he is not making changes in Nagios currently. He is simulating the requests and response without making the change.
He is using the database because it is stable and does not change with patches and updates. The API does not support everything that is needed, and it has proven inconsistent. He doesn't want to deal with building and importing configuration files. Using a script to manipulate the web interface would be cumbersome and would require adjustments whenever the UI changes or features are added.
Re: XI System Component Status - Issues
I have to let you know that it's strongly advised that you don't modify the database directly. If you do, things could break, but I trust you're aware of that. Reapplying the configuration will absolutely stop the monitoring engine temporarily, but it should recover shortly after. How much RAM and how many CPU cores does this machine have?
Re: XI System Component Status - Issues
We are not modifying the database structure, just adding/updating/deleting data within the existing structure. Thank you for the information.
[root@<Nagios Server>]# cat /proc/meminfo
MemTotal: 65694708 kB
MemFree: 3598248 kB
MemAvailable: 60909412 kB
SwapTotal: 4194300 kB
SwapFree: 4076044 kB
[root@<Nagios Server>]# cat /proc/cpuinfo
processor : 31
vendor_id : GenuineIntel
cpu family : 6
model : 63
model name : Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
stepping : 2
microcode : 0x31
cpu MHz : 1239.750
cache size : 20480 KB
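For anyone reading along, the raw /proc output above can be boiled down to the two numbers that were asked for. This is only a rough sketch: the function names `count_cpus` and `mem_gib` are made up for illustration, and it assumes the usual Linux layout of /proc/cpuinfo (one "processor" line per logical CPU) and /proc/meminfo (values in kB).

```shell
#!/usr/bin/env bash
# Hypothetical helpers (names are illustrative, not part of Nagios XI).

# Count logical CPUs: cpuinfo has one line starting with "processor" per logical CPU.
count_cpus() {
    grep -c '^processor' "${1:-/proc/cpuinfo}"
}

# Print one meminfo field (e.g. MemTotal) converted from kB to whole GiB.
# The result is truncated toward zero, so 62.6 GiB prints as 62.
mem_gib() {
    local field="$1" file="${2:-/proc/meminfo}"
    awk -v f="${field}:" '$1 == f { printf "%d\n", $2 / 1024 / 1024 }' "$file"
}
```

Running `count_cpus` and `mem_gib MemTotal` with no file argument reads the live /proc files. Since "processor : 31" is only the last entry in the paste and numbering starts at 0, the count is probably 32 logical CPUs, but that is an inference from the excerpt.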
Re: XI System Component Status - Issues
How often are you seeing the service stop/start since fixing the kernel message queue issue?
Re: XI System Component Status - Issues
Over the last 30 minutes, I noticed it flip-flopped 6 times. My co-worker and I were not making changes during those 30 minutes.
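One way to put a number on the flip-flopping is to count engine start-ups recorded in nagios.log for a given window. The sketch below is an assumption-laden example, not an official tool: `count_starts` is a made-up name, and it assumes each log line begins with an epoch timestamp in square brackets and that a start-up is logged as "Nagios <version> starting... (PID=...)".

```shell
#!/usr/bin/env bash
# Hypothetical helper: count Nagios start-up lines between two epoch timestamps.
# Usage: count_starts <logfile> <from_epoch> <to_epoch>
count_starts() {
    local log="$1" from="$2" to="$3"
    awk -v from="$from" -v to="$to" '
        /^\[[0-9]+\]/ {
            # First field looks like "[1546300800]"; strip the brackets.
            ts = substr($1, 2, length($1) - 2) + 0
            # Assumed start-up message format: "Nagios <version> starting..."
            if (ts >= from && ts <= to && /Nagios .* starting\.\.\./) n++
        }
        END { print n + 0 }
    ' "$log"
}
```

With GNU date, something like `count_starts /usr/local/nagios/var/nagios.log "$(date -d '30 minutes ago' +%s)" "$(date +%s)"` would cover the last half hour. The log path shown is the usual default and may differ on your install.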
Re: XI System Component Status - Issues
Can you search the log files in the /var/log folder in that 30 minute span for any information on this and post the output here?
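A possible starting point for that search is to list which files under /var/log changed in the last 30 minutes and mention Nagios at all. This is a minimal sketch with a made-up function name (`recent_log_hits`), and the default search pattern "nagios" is only a guess at what is relevant here.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list files modified within the last N minutes that
# contain a pattern (case-insensitive). Binary files are skipped (-I).
# Usage: recent_log_hits [dir] [minutes] [pattern]
recent_log_hits() {
    local dir="${1:-/var/log}" minutes="${2:-30}" pattern="${3:-nagios}"
    # -mmin -N selects files modified less than N minutes ago.
    # "|| true" keeps "no matches" from being reported as a failure.
    find "$dir" -type f -mmin "-$minutes" \
        -exec grep -Iil -- "$pattern" {} + 2>/dev/null || true
}
```

Running `recent_log_hits` as root right after a flip-flop prints the matching file names; those files can then be opened and trimmed to the relevant time span before posting.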
Re: XI System Component Status - Issues
Sorry for the delay. I will check the logs when it happens next time. Is there something I should be looking for specifically?