
From Nagios/Munin to where

Posted: Mon Aug 01, 2022 5:37 am
by JohnSonandrla
We want to modernize our company's monitoring tools. We have been using Nagios and Munin for monitoring for about 5 years. Our problem with Nagios is the configuration complexity, and Munin does not have a modern enough interface, so no one looks at the monitoring screen.

There are about 250 servers, all of them Linux and on-premise within the company. We do not monitor any applications; only health checks of the existing servers matter to us. We want to modernize the system a bit: maybe we could monitor the hardware and drivers we test on the servers, or include Jenkins and other tools in monitoring.

I've looked through a few current tools and tried Prometheus/Grafana, Zabbix, and even a Nagios/Grafana integration. The Prometheus/Grafana integration felt the most seamless. However, when I did a little research, I saw that Prometheus is generally preferred for application monitoring, cloud, and SaaS. Is it overkill for health checks on Linux servers plus monitoring a few applications in the future? We also need to store 1-2 years of monitoring data, and we would like to see a 1-year timeline on the graphs.

In this case, how would you compare Nagios/Munin, Prometheus/Grafana, and Zabbix when we put all three on the table? As I said before, all servers are on-premise; there is no cloud service.

Thanks in advance.

Re: From Nagios/Munin to where

Posted: Tue Aug 02, 2022 1:29 am
by VeroniTasingolir
Munin is a networked resource monitoring tool that can help analyze resource trends and "what just happened to kill our performance?" problems. It is designed to be very plug and play. A default installation provides a lot of graphs with almost no work.

Nagios is a monitoring (alerting) tool. Munin could be considered a replacement for Cacti.

We use both of them: Nagios and Munin.

Nagios tells us in real time if something is wrong, like a web server being down, a high database load average, etc.
Using Munin, you can see the trends and the history that explain why it happened.

Munin and Nagios are really different tools.
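
To make the alerting side concrete, here is a minimal sketch of a Nagios-style load check in Python. It relies on the standard plugin convention of exit codes 0 (OK), 1 (WARNING), 2 (CRITICAL) and 3 (UNKNOWN). The function names and the 4.0/8.0 thresholds are assumptions for illustration, not something taken from an existing plugin.

```python
import os

# Nagios plugin exit codes (plugin API convention).
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATE_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def classify_load(load1, warn, crit):
    """Map a 1-minute load average onto a Nagios state and a status line."""
    if load1 >= crit:
        state = CRITICAL
    elif load1 >= warn:
        state = WARNING
    else:
        state = OK
    # The text after "|" is performance data that a grapher can pick up.
    line = (
        f"LOAD {STATE_NAMES[state]} - load1={load1:.2f} "
        f"| load1={load1:.2f};{warn};{crit}"
    )
    return state, line


def main(warn=4.0, crit=8.0):
    """Read the local load average, print the status line, return the exit code.

    The default thresholds are hypothetical; tune them per host.
    """
    try:
        load1, _, _ = os.getloadavg()
    except OSError:
        print("LOAD UNKNOWN - could not read load average")
        return UNKNOWN
    state, line = classify_load(load1, warn, crit)
    print(line)
    return state
```

To run it as a real check, end the script with `import sys; sys.exit(main())` so Nagios sees the exit code. The status line is what triggers the alert, while the performance data after the `|` is the kind of number a trending tool graphs over time.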

Re: From Nagios/Munin to where

Posted: Fri Aug 05, 2022 7:02 am
by AkinBredailik
I think you misunderstand the use of "preferred" here. It's not that Prometheus prefers one thing or the other. It's that these applications are easier to monitor with Prometheus, so people prefer it for them.

Prometheus can monitor basically anything. It just also happens to be good at things that Nagios isn't. It's a superset of the functionality of Nagios and Munin.

Set the Prometheus TSDB retention to store your 2 years of data. Deploy the node_exporter. You're basically good to go. Don't overcomplicate it.
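
For a rough sense of what 2 years of retention costs in disk, the Prometheus storage documentation gives a rule of thumb: needed disk space = retention time in seconds × ingested samples per second × bytes per sample, with roughly 1 to 2 bytes per sample. Below is a minimal Python sketch of that arithmetic. The 1,000 series per host and the 15-second scrape interval are assumptions for illustration, not measurements from your servers. Retention itself is set with the `--storage.tsdb.retention.time` flag (for example `2y`).

```python
SECONDS_PER_DAY = 86_400


def estimate_tsdb_bytes(hosts, series_per_host, scrape_interval_s,
                        retention_days, bytes_per_sample=2.0):
    """Estimate disk usage as retention_seconds * samples_per_second * bytes_per_sample."""
    samples_per_second = hosts * series_per_host / scrape_interval_s
    retention_seconds = retention_days * SECONDS_PER_DAY
    return retention_seconds * samples_per_second * bytes_per_sample


# Assumed inputs: 250 hosts (from the thread), 1,000 series per host and a
# 15 s scrape interval (both hypothetical), 730 days of retention.
size = estimate_tsdb_bytes(hosts=250, series_per_host=1000,
                           scrape_interval_s=15, retention_days=730)
print(f"{size / 1e12:.2f} TB")
```

With those assumed inputs the estimate comes out at about 2.1 TB at the pessimistic 2 bytes per sample, which fits on a single disk. Measure your real series count after deploying node_exporter and rerun the numbers before buying storage.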

Re: From Nagios/Munin to where

Posted: Mon Aug 08, 2022 6:50 am
by SharphSonirak
For your case you can use any of them, but it is better to try them all on 100 servers and compare the pros and cons. For network switches, security cameras, video registrars, 7 bare metal servers and up to 20 virtual machines, I use https://gitlab.com/mikler/glaber . It is a fork of Zabbix ver 5.x with various patches for high load. Glaber uses ClickHouse for history and trends, which has good data compression and blazing fast query execution. For long-term storage, as you mentioned ("We also need to store 1-2 years of monitoring data, and we would like to see a 1-year timeline on the graphs"), it will be OK. Maybe apply some tuning as well.

BTW: if you choose the Prometheus stack, try VictoriaMetrics. It was built as long-term storage for Prometheus, but it can now be used instead of Prometheus. See https://docs.victoriametrics.com/Single ... t-features . You can try it by running it in Docker (single and cluster versions) or by setting up a single instance on Linode or DigitalOcean.