Log Distribution with a cluster best practices

Locked
2evanowen
Posts: 35
Joined: Fri Jan 16, 2015 1:24 pm

Post by 2evanowen »

Hey guys,
I have a few questions that I ran into while patching...

When I take down my Master in the cluster (the one that I am sending all of the logs to) none of the other systems in the cluster receive logs because they are inheriting the logs from the master.
1) Will Nagios Log Server go back and get the logs it missed while it was down?
2) Is there a way for another system in the cluster to take over being the master while it's down (so I can still see current log files even while patching the original master)?
3) What is the best way for me to ensure that I don't lose logs when I am patching a server during that down time? Do you guys have a best practice for that?
polarbear1
Posts: 73
Joined: Mon Apr 13, 2015 4:26 pm

Re: Log Distribution with a cluster best practices

Post by polarbear1 »

For #1 and #3 -- Long story short, you want to set up nxlog and rsyslog to queue your messages if they are unable to send.

Rsyslog --

Check the bottom section of your files at /etc/rsyslog.d/90-nagioslogserver_*.conf and you want to make them look something like this ...

Code: Select all

# Forward to Nagios Log Server and then discard, otherwise these messages
# will end up in the syslog file (/var/log/messages) unless there are other
# overriding rules.
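#Buffer Settings
# Seconds to wait between retries while the server is unreachable
$ActionResumeInterval 10
# Queue limits, in messages: total size, discard threshold, and the point
# at which the queue starts spooling to disk
$ActionQueueSize 100000
$ActionQueueDiscardMark 97500
$ActionQueueHighWaterMark 80000
# In-memory queue; setting a queue file name makes it disk-assisted
$ActionQueueType LinkedList
$ActionQueueFileName each_queue_should_have_a_unique_queue_file_name
$ActionQueueCheckpointInterval 100
# Upper limit on disk space used by the spool files
$ActionQueueMaxDiskSpace 500m
# -1 = keep retrying forever instead of giving up
$ActionResumeRetryCount -1
# Write any queued messages to disk when rsyslog shuts down
$ActionQueueSaveOnShutdown on
$ActionQueueTimeoutEnqueue 0
$ActionQueueDiscardSeverity 0
# @@ forwards over TCP (a single @ would be UDP)
if $programname == 'my_log_file' then @@my_NLS_server:5544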
if $programname == 'my_log_file' then ~

Also note the $WorkDirectory variable higher up in the config. This is where your disk buffer files will be written, so make sure you have enough space available. By default it's going somewhere in the /var partition. It's OK to change it to something else. In my case I made a folder for it on /home because that's where I have the most room to spare on my build.
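
To get a rough feel for whether the queue settings above will ride out a patch window, the arithmetic can be sketched in Python. This is only a back-of-the-envelope estimate, not anything rsyslog itself computes. The message rate, average message size, and outage length are made-up inputs, the function names are hypothetical, and the disk limit assumes `500m` means 500 x 1024 x 1024 bytes. It also ignores rsyslog's on-disk overhead and conservatively counts the whole backlog against the disk limit, so treat the result as a rough guide only.

```python
# Rough sizing sketch for a disk-assisted rsyslog queue during an outage.
# All inputs are illustrative assumptions; measure your own rates.

HIGH_WATER_MARK = 80_000             # $ActionQueueHighWaterMark from the config above
MAX_DISK_BYTES = 500 * 1024 * 1024   # assumes "500m" means 500 MiB


def seconds_until_spooling(msgs_per_sec: float, mark: int = HIGH_WATER_MARK) -> float:
    """Seconds of outage before the in-memory queue reaches the high water mark."""
    if msgs_per_sec <= 0:
        raise ValueError("msgs_per_sec must be positive")
    return mark / msgs_per_sec


def outage_bytes(msgs_per_sec: float, avg_msg_bytes: float, outage_seconds: float) -> float:
    """Raw bytes of log data produced while the log server is unreachable."""
    return msgs_per_sec * avg_msg_bytes * outage_seconds


def fits_on_disk(msgs_per_sec: float, avg_msg_bytes: float, outage_seconds: float,
                 max_disk_bytes: int = MAX_DISK_BYTES) -> bool:
    """True if the raw backlog stays within the configured disk limit."""
    return outage_bytes(msgs_per_sec, avg_msg_bytes, outage_seconds) <= max_disk_bytes


if __name__ == "__main__":
    # Hypothetical host: 200 messages/s, 300 bytes each, 30-minute patch window.
    rate, size, window = 200, 300, 30 * 60
    print(seconds_until_spooling(rate))      # time until spooling to disk starts
    print(outage_bytes(rate, size, window))  # raw backlog in bytes
    print(fits_on_disk(rate, size, window))  # does it fit under the disk limit?
```

If the backlog does not fit, either raise $ActionQueueMaxDiskSpace (and make sure the $WorkDirectory partition has the room) or shorten the patch window.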

Related reading:
http://www.rsyslog.com/doc/v8-stable/co ... ueues.html


NXlog -- Check your C:\Program Files (x86)\nxlog\conf\nxlog.conf files...

Add these sections after the <Extension> sections. Note the sizes are in kilobytes; change them based on what fits your situation.

Code: Select all

<Processor membuffer>
    Module  pm_buffer
    MaxSize 512000
    Type    Mem
</Processor>

<Processor diskbuffer>
    Module  pm_buffer
    MaxSize 5242880
    Type    Disk
    File    "C:\My\Working\Directory"
    WarnLimit   3932160
</Processor>

Then at the bottom, tweak your route to something like this:

Code: Select all

<Route 1>
    Path my_file_1, my_file_2 => diskbuffer => membuffer => out
</Route>

I know the Route logic looks a little weird, but that's because it works backwards. It will try to send to out, but if it can't, the messages back up into membuffer; if membuffer is full, they go to diskbuffer; and if diskbuffer is full, it will just start discarding.
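
For sizing the two buffers, it can help to convert the MaxSize and WarnLimit numbers into friendlier units. The Python sketch below assumes the values are kilobytes, which is how the NXLog pm_buffer documentation describes them (worth checking against your NXLog version), and treats 1 MB as 1024 KB. The helper name and the dictionary are just for illustration.

```python
# Convert the pm_buffer sizes from the config above into megabytes.
# Assumption: MaxSize and WarnLimit are expressed in kilobytes.

KB_PER_MB = 1024


def kb_to_mb(kilobytes: int) -> float:
    """Convert a size in kilobytes to megabytes (1 MB = 1024 KB)."""
    return kilobytes / KB_PER_MB


# Values copied from the membuffer and diskbuffer blocks above.
BUFFER_SIZES_KB = {
    "membuffer MaxSize": 512000,
    "diskbuffer MaxSize": 5242880,
    "diskbuffer WarnLimit": 3932160,
}

if __name__ == "__main__":
    for name, size_kb in BUFFER_SIZES_KB.items():
        print(f"{name}: {kb_to_mb(size_kb):.0f} MB")

    # Fraction of the disk buffer that is used when the warning is logged.
    ratio = BUFFER_SIZES_KB["diskbuffer WarnLimit"] / BUFFER_SIZES_KB["diskbuffer MaxSize"]
    print(f"WarnLimit is {ratio:.0%} of the disk buffer")
```

Under that assumption the memory buffer is 500 MB, the disk buffer is 5120 MB, and the warning fires at three quarters of the disk buffer, so scale the numbers down if the host cannot spare that much.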

When you reconnect, all your queues will be sent to NLS (but it may take a while to catch up). The rsyslog queues will even have the correct timestamps (you can watch it backfill the gap in your dashboard). I am still trying to figure out how to make nxlog do this - right now it just dumps the whole queue with the current timestamp, not the time the event actually happened.


As for #2, I'm only guessing here, so I'd wait for the official answer. If you set up DNS round robin and point your client servers to send logs to @@MY_NLS_CLUSTER:5544 (instead of @@MY_NLS_BOX_1:5544), then if NLS_BOX_1 goes down, traffic should fairly transparently go to NLS_BOX_X.
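
The failover idea can be sketched from the client's point of view: resolve the shared name to every address behind it and use the first instance that accepts a TCP connection. This is a minimal Python illustration under assumptions, not what rsyslog or nxlog do internally. `MY_NLS_CLUSTER` is the placeholder name from this post, and `resolve_all`, `can_connect`, and `pick_live_instance` are hypothetical helper names. Keep in mind that plain round robin DNS does no health checking, so a client can still be handed the address of a box that is down and has to retry.

```python
import socket
from typing import Callable, List, Optional


def resolve_all(host: str, port: int) -> List[str]:
    """Return every distinct IP address the name resolves to, in resolver order."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    addresses: List[str] = []
    for _family, _socktype, _proto, _canonname, sockaddr in infos:
        ip = sockaddr[0]
        if ip not in addresses:
            addresses.append(ip)
    return addresses


def can_connect(ip: str, port: int, timeout: float = 3.0) -> bool:
    """True if a TCP connection to ip:port succeeds within the timeout."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False


def pick_live_instance(
    host: str,
    port: int,
    resolve: Callable[[str, int], List[str]] = resolve_all,
    connect: Callable[[str, int], bool] = can_connect,
) -> Optional[str]:
    """Return the first resolved address that accepts a connection, else None."""
    for ip in resolve(host, port):
        if connect(ip, port):
            return ip
    return None
```

Calling `pick_live_instance("MY_NLS_CLUSTER", 5544)` would return the first reachable address behind the name, or `None` if no instance answers. The `resolve` and `connect` parameters exist only so the logic can be exercised without a real cluster.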

Hopefully jolson will chime in with a better answer, but this should give you something to think about for now. Cheers.
jolson
Attack Rabbit
Posts: 2560
Joined: Thu Feb 12, 2015 12:40 pm

Re: Log Distribution with a cluster best practices

Post by jolson »

polarbear1,

Thanks for taking the time to write out all of this - it's very helpful. :)

I just wanted to add some information regarding question #2.
2) Is there a way for another system in the cluster to take over being the master while it's down (so I can still see current log files even while patching the original master)?
Being the master instance of a cluster has no bearing on how logs are received by the cluster. You can send your logs to any instance (master or non-master) and the logs will be processed the same.

That being said, the way to get around this is to use either a hardware load balancer (F5 or similar) or DNS Round Robin balancing. There are pros and cons to each of these methods, and a great discussion took place here: https://support.nagios.com/forum/viewto ... 38&t=33005 (It's on the customer only forum).
Twits Blog
Show me a man who lives alone and has a perpetually clean kitchen, and 8 times out of 9 I'll show you a man with detestable spiritual qualities.
2evanowen
Posts: 35
Joined: Fri Jan 16, 2015 1:24 pm

Re: Log Distribution with a cluster best practices

Post by 2evanowen »

Thanks guys this should be enough to get me rolling with a solution.
:)
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: Log Distribution with a cluster best practices

Post by tmcdonald »

Great! I'll be closing this thread now, but feel free to open another if you need anything in the future!

And thanks again to @polarbear1!
Former Nagios employee
jolson
Attack Rabbit
Posts: 2560
Joined: Thu Feb 12, 2015 12:40 pm

Re: Log Distribution with a cluster best practices

Post by jolson »

Sounds good! I'll lock the thread.