NagiosXI had a seizure
Re: NagiosXI had a seizure
I took a look at the MT iSMS component, and unless I'm reading the code wrong it appears to be adding another event handler to Nagios that makes an HTTP call to send the SMS. So this would mean it's either queued in the device itself, or (more likely) on the server that sends the SMS.
Former Nagios employee
Re: NagiosXI had a seizure
abrist wrote: SMS can be sent one of two ways. If you use the XI mailer, they are sent out immediately, no queue, and may be queued on the carrier's servers. If you send SMS with sendmail, then you can check the queue with "mailq".
I'm using the MultiTech component with a MultiTech network-attached SMS modem. It literally took all day for all of the SMS messages to get through to the phones.
abrist wrote: My suspicion is load/io wait on the mysql server, io wait on the nagios server during the config write out process, or network latency.
My MySQL server is virtual and has plenty of horsepower behind it, so I'm not sure what I could do about latency there. So I was wanting to do the ramdisk thing, but I wouldn't mind help from him so I make sure I don't miss anything. I can always get my Linux admin to help, but I figure why not ask the experts.
I think Spencer is suggesting creating a ramdisk for /usr/local/nagios/etc/ and then rsyncing it to somewhere for a backup.
2 of XI5.6.14 Prod/DR/DEV - Nagios LogServer 2 Nodes
See my projects on the Exchange at BanditBBS - Also check out my Nagios stuff on my personal page at Bandit's Home and at github
Re: NagiosXI had a seizure
tmcdonald wrote: I took a look at the MT iSMS component, and unless I'm reading the code wrong it appears to be adding another event handler to Nagios that makes an HTTP call to send the SMS. So this would mean it's either queued in the device itself, or (more likely) on the server that sends the SMS.
Yes, it does it via HTTP. I could see the outbox filling up all day, as if Nagios was still making those HTTP posts all day long.
2 of XI5.6.14 Prod/DR/DEV - Nagios LogServer 2 Nodes
See my projects on the Exchange at BanditBBS - Also check out my Nagios stuff on my personal page at Bandit's Home and at github
-
sreinhardt
- -fno-stack-protector
- Posts: 4366
- Joined: Mon Nov 19, 2012 12:10 pm
Re: NagiosXI had a seizure
Andy is correct. I would suggest ramdisking /usr/local/nagios/etc and rsyncing either after apply config or at a set interval. Something I use on my personal systems, and which is likely a little controversial for production, is the Anything-Sync-Daemon from ArchLinux land. You configure it as a system service and set up the fstab rules to mount that directory to tmpfs, and it handles the rsync for you. This way, in the event of a system crash or otherwise, you can start this daemon, allow it to copy your configs to memory, then start nagios as though nothing happened. Additionally, most configs, with the notable exceptions being nagios.cfg, the ndoutils configs (ndo2db.cfg, ndomod.cfg) and such, would be repopulated to memory via an apply config at the absolute very worst. Obviously I would suggest testing this on a non-prod server first.
The profile-sync-daemon wiki article provides a little more in-depth information on what is actually happening and what should be configured. Also realize that Arch uses systemd\systemctl and might have some other slight differences from CentOS, but by and large it should be entirely possible to change the systemd\systemctl stuff to an init.d script. I can work on testing some of this if this is the route you would like to go.
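For the plain rsync variant (without Anything-Sync-Daemon), the periodic copy could look roughly like the sketch below. This is only an illustration under assumptions: the function name `backup_configs`, the sentinel check on nagios.cfg, and the cp fallback are not from any Nagios XI tooling. The sentinel check is there so that an empty, freshly mounted tmpfs is never treated as a valid source.

```shell
#!/bin/sh
# Guarded one-way copy of a config directory to a backup location.
# Hypothetical helper; the caller supplies the source and destination paths.

# backup_configs SRC DST [SENTINEL]
# Copies SRC into DST only if SRC contains SENTINEL (default: nagios.cfg).
backup_configs() {
    src=$1
    dst=$2
    sentinel=${3:-nagios.cfg}

    if [ ! -f "$src/$sentinel" ]; then
        echo "refusing to sync: $src/$sentinel not found" >&2
        return 1
    fi

    mkdir -p "$dst" || return 1
    if command -v rsync >/dev/null 2>&1; then
        # -a keeps permissions and timestamps (and ownership when run as root)
        rsync -a "$src"/ "$dst"/
    else
        # Fallback with the same intent when rsync is not installed
        cp -a "$src"/. "$dst"/
    fi
}
```

Saved as a script that ends with a call such as `backup_configs /usr/local/nagios/etc /var/backups/nagios-etc` (the backup path is a made-up example), it could be run from cron at whatever interval feels safe, and again right after each apply config.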
Nagios-Plugins maintainer exclusively, unless you have other C language bugs with open-source nagios projects, then I am happy to help! Please pm or use other communication to alert me to issues as I no longer track the forum.
Re: NagiosXI had a seizure
As the message is posted through an HTTP request to the modem, the queue was probably on the modem or on the carrier's relay servers. Have you looked at the modem's web interface for a way to clear the queue?
Former Nagios employee
"It is turtles. All. The. Way. Down. . . .and maybe an elephant or two."
VI VI VI - The editor of the Beast!
Come to the Dark Side.
Re: NagiosXI had a seizure
abrist wrote: As the message is posted through an HTTP request to the modem, the queue was probably on the modem or on the carrier's relay servers. Have you looked at the modem's web interface for a way to clear the queue?
Yeah, I just didn't look very deep. I'll look more later.
2 of XI5.6.14 Prod/DR/DEV - Nagios LogServer 2 Nodes
See my projects on the Exchange at BanditBBS - Also check out my Nagios stuff on my personal page at Bandit's Home and at github
Re: NagiosXI had a seizure
sreinhardt wrote: Andy is correct. I would suggest ramdisking /usr/local/nagios/etc and rsyncing either after apply config or at a set interval. Something I use on my personal systems, and which is likely a little controversial for production, is the Anything-Sync-Daemon from ArchLinux land. You configure it as a system service and set up the fstab rules to mount that directory to tmpfs, and it handles the rsync for you. This way, in the event of a system crash or otherwise, you can start this daemon, allow it to copy your configs to memory, then start nagios as though nothing happened. Additionally, most configs, with the notable exceptions being nagios.cfg, the ndoutils configs (ndo2db.cfg, ndomod.cfg) and such, would be repopulated to memory via an apply config at the absolute very worst. Obviously I would suggest testing this on a non-prod server first.
The profile-sync-daemon wiki article provides a little more in-depth information on what is actually happening and what should be configured. Also realize that Arch uses systemd\systemctl and might have some other slight differences from CentOS, but by and large it should be entirely possible to change the systemd\systemctl stuff to an init.d script. I can work on testing some of this if this is the route you would like to go.
Thanks... meeting with my Linux admin in 30 minutes to discuss and test on my dev server.
2 of XI5.6.14 Prod/DR/DEV - Nagios LogServer 2 Nodes
See my projects on the Exchange at BanditBBS - Also check out my Nagios stuff on my personal page at Bandit's Home and at github
sreinhardt
Re: NagiosXI had a seizure
You're welcome! Let us know what you decide. As I said, if you do go this route, I am happy to test here too. It is something I have been planning to implement anyway for higher load\end systems, as it should provide massive nagios reload\restart improvements (ns vs ms read latency).
Last edited by sreinhardt on Mon Dec 16, 2013 1:40 pm, edited 1 time in total.
Reason: words and stuff
Nagios-Plugins maintainer exclusively, unless you have other C language bugs with open-source nagios projects, then I am happy to help! Please pm or use other communication to alert me to issues as I no longer track the forum.
Re: NagiosXI had a seizure
Spenser (my temporary new best friend),
Linux admin and I did these steps on my dev NagiosXI server:
- mv /usr/local/nagios/etc to some temp location
- recreated the etc folder
- mount -t tmpfs none /usr/local/nagios/etc -o size=50m
- chown apache:nagios on that folder and copied everything from the temp location to the new ramdisk
Code: Select all
-rw-rw-r-- 1 apache nagios 793 Dec 16 14:59 cgi.cfg
-rw-rw-r-- 1 apache nagios 25826 Dec 16 15:00 commands.cfg
-rw-rw-r-- 1 apache nagios 1073 Dec 16 15:00 contactgroups.cfg
-rw-rw-r-- 1 apache nagios 2682 Dec 16 15:00 contacts.cfg
-rw-rw-r-- 1 apache nagios 1500 Dec 16 15:00 contacttemplates.cfg
-rw-rw-r-- 1 apache nagios 642 Dec 16 15:00 hostdependencies.cfg
-rw-rw-r-- 1 apache nagios 644 Dec 16 15:00 hostescalations.cfg
-rw-rw-r-- 1 apache nagios 662 Dec 16 15:00 hostextinfo.cfg
-rw-rw-r-- 1 apache nagios 984 Dec 16 15:00 hostgroups.cfg
drwxrwxr-x 2 apache nagios 320 Dec 16 14:59 hosts
-rw-rw-r-- 1 apache nagios 13940 Dec 16 15:00 hosttemplates.cfg
drwxrwxr-x 2 apache nagios 40 Dec 16 14:59 import
-rwxrwxr-x 1 apache nagios 5764 Dec 16 14:59 nagios.cfg
-rw-rw-r-- 1 apache nagios 2229 Dec 16 14:59 ndo2db.cfg
-rw-rw-r-- 1 apache nagios 4827 Dec 16 14:59 ndomod.cfg
-rw-rw-r-- 1 apache nagios 7227 Dec 16 14:59 nrpe.cfg
-rw-rw-r-- 1 apache nagios 5374 Dec 16 14:59 nsca.cfg
drwxrwxr-x 4 apache nagios 260 Dec 16 14:59 pnp
-rwxrwxr-x 1 apache nagios 210 Dec 16 14:59 resource.cfg
-rw-rw-r-- 1 apache nagios 1627 Dec 16 14:59 send_nsca.cfg
-rw-rw-r-- 1 apache nagios 648 Dec 16 15:00 servicedependencies.cfg
-rw-rw-r-- 1 apache nagios 650 Dec 16 15:00 serviceescalations.cfg
-rw-rw-r-- 1 apache nagios 668 Dec 16 15:00 serviceextinfo.cfg
-rw-rw-r-- 1 apache nagios 638 Dec 16 15:00 servicegroups.cfg
drwxrwxr-x 2 apache nagios 260 Dec 16 14:59 services
-rw-rw-r-- 1 apache nagios 21181 Dec 16 15:00 servicetemplates.cfg
drwxrwxr-x 2 apache nagios 100 Dec 16 14:59 static
-rw-rw-r-- 1 apache nagios 5104 Dec 16 15:00 timeperiods.cfg
[root@rn000002 etc]#
As you can see, the files have been updated, except nagios.cfg, ndo2db.cfg and a few other files. So now I am trying to figure out the best way to actually implement and test this. How do I get those unchanging config files (nagios.cfg, nrpe.cfg, etc.) onto the ramdisk upon a server restart? I have no issue adding a step to the reboot/shutdown procedures, for example:
Adding to the shutdown server procedure:
- Copy everything from ramdisk to a temp/backup location (or even create a cron to do it hourly, it's only 10MB)
- Copy everything from temp/backup location to the ramdisk and go to the GUI and hit apply changes
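The manual ramdisk steps described in this post could be captured in a small script along these lines. It is a sketch, not a tested procedure: the `STASH` path, the `run` helper, and the dry-run default are assumptions added for illustration, while the mount command and the apache:nagios ownership come from the steps and listing in the post.

```shell
#!/bin/sh
# Sketch of the manual ramdisk steps. With DRY_RUN=1 (the default here) the
# commands are only printed; set DRY_RUN=0 and run as root to execute them.

ETC=/usr/local/nagios/etc          # directory from the thread
STASH=/usr/local/nagios/etc.disk   # hypothetical on-disk copy of the configs
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, otherwise execute it as given.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

setup_ramdisk() {
    run mv "$ETC" "$STASH" &&                        # keep the originals on disk
    run mkdir "$ETC" &&                              # recreate the mount point
    run mount -t tmpfs none "$ETC" -o size=50m &&    # same options as in the post
    run chown apache:nagios "$ETC" &&                # owner and group from the listing
    run cp -a "$STASH"/. "$ETC"/                     # repopulate the ramdisk
}
```

Calling `setup_ramdisk` with the default settings just prints the five commands, which makes it easy to review them before running anything for real on a dev box.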
2 of XI5.6.14 Prod/DR/DEV - Nagios LogServer 2 Nodes
See my projects on the Exchange at BanditBBS - Also check out my Nagios stuff on my personal page at Bandit's Home and at github
sreinhardt
Re: NagiosXI had a seizure
The links I suggested before were specifically for rsyncing all files in that directory. Otherwise, yes doing an on boot and on shutdown copy would likely suffice. I would suggest doing a regular cron or other copy\sync as well just in case the server flips out again. Otherwise I think you got it just right! Have you noticed any improvements?
Edit: it is also probably necessary to keep nagios and related services from loading until those files have been copied over. Maybe alter your boot script so that it loads configs onto the ramdisk and then starts the proper services, and otherwise disable those services from starting on boot.
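A boot-time wrapper for that ordering might look something like the following. Treat it as a rough sketch under assumptions: the backup path, the service list and its order, and the `SERVICE_CMD` override are all made up for illustration, and it assumes the tmpfs is already mounted (for example via fstab) before it runs.

```shell
#!/bin/sh
# Boot-time ordering sketch: repopulate the ramdisk first, start services second.

ETC=${ETC:-/usr/local/nagios/etc}
BACKUP=${BACKUP:-/var/backups/nagios-etc}     # hypothetical backup location
SERVICES=${SERVICES:-"ndo2db nagios npcd"}    # assumed service names and order

# Restore configs if the ramdisk is empty, then start services only once
# nagios.cfg is actually present on the ramdisk.
start_when_ready() {
    if [ ! -f "$ETC/nagios.cfg" ]; then
        cp -a "$BACKUP"/. "$ETC"/ || return 1
    fi
    if [ ! -f "$ETC/nagios.cfg" ]; then
        echo "nagios.cfg missing in $ETC, not starting services" >&2
        return 1
    fi
    for svc in $SERVICES; do
        # SERVICE_CMD defaults to the init-script wrapper "service"
        ${SERVICE_CMD:-service} "$svc" start || return 1
    done
}
```

With the services themselves switched off at boot (on CentOS, something like `chkconfig nagios off`), a call to `start_when_ready` from a late boot script would then be the only thing that starts them.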
Nagios-Plugins maintainer exclusively, unless you have other C language bugs with open-source nagios projects, then I am happy to help! Please pm or use other communication to alert me to issues as I no longer track the forum.