On Thursday, 4 August 2016 05:41:59 UTC+8, Linwood Ferguson wrote:
>
> I'm struggling a bit to avoid the "just throw logs in and figure out later 
> what to do with them" inclination, and trying to plan how the different 
> pieces might best be used.
>
> I'd appreciate any comments as to whether this is a good approach.  I even 
> have a picture.
>
> My thinking goes like this: 
>
> 1) Bring data in and use extractors (mostly grok) to normalize to some set 
> of standardized fields, somewhat based on what I can get free from Gelf.  I 
> expect this kind of normalization will be a work in progress forever.  Grok 
> especially but extractors in general seem easier to use than pipelines for 
> normalization.
>
> 2) Let everything just stay in the default stream at that point, and feed 
> into a set of pipeline rules.
>
> 3) Pipelines decide how to map the log messages from the physical origins 
> into logical groupings, for example actual device (e.g. hardware or 
> similar) events, infrastructure logins to network gear, VPN and similar 
> access, web logs (probably of several different types), etc.
>
> 3A) Garbage messages no one really cares about get dropped here.
>
> 3B) Some messages might end up in two places, e.g. we might have certain 
> data access streams which are also web or FTP logs.
>
> 4) Streams control the alarms.
>
> All wet, or going in the right direction? 
>
>

Hi Linwood, 

Thanks for sharing this. I've been working with my boss on a Graylog 
project, and we've discussed this a few times: there are various places 
where you can filter, so what is the optimal approach?

So far we have a good setup for Linux-based server hosts (we're primarily 
interested in the naughties in '/var/log/security' and in SELinux 
'avc: denied' messages, which generally mean something is broken). I tried 
a number of 'shippers', but the one we settled on was Elastic's 'filebeat'.

It had the following advantages:

- Written in Go, so it's a single binary (i.e. doesn't require a JVM)
- Easy config (YAML-based, so config-management friendly)
- Allows simple filtering before the data leaves the server (instead of 
pumping in all your logs and then filtering with Graylog), so you're only 
searching against the data you care about
- Graylog has an input plugin that supports the Beats shipper.
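
To illustrate that third point, here's a minimal filebeat.yml sketch of 
source-side filtering (filebeat 1.x-era syntax). The paths, patterns, 
hostname, and port are illustrative placeholders, not a tested config:

```yaml
filebeat:
  prospectors:
    - paths:
        - /var/log/security
      # only ship the auth lines we actually care about
      include_lines: ["Failed password", "Invalid user"]
    - paths:
        - /var/log/audit/audit.log
      # SELinux AVC denials
      include_lines: ["avc:.*denied"]

output:
  logstash:
    # Graylog's Beats input speaks the Logstash/lumberjack protocol;
    # point this at your Graylog node and the port of your Beats input
    hosts: ["graylog.example.com:5044"]
```

Everything that doesn't match an include_lines pattern never leaves the 
host, which is the whole point.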

For Linux-based hosts, IMO this is a lot more fun than syslog manipulation 
(which you will most likely be forced into for your other devices). So 
beyond recommending the shipper (which was itself recommended to me), I 
think filtering at the source is an effective strategy.
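
That said, if you do end up filtering on the Graylog side (step 3A in your 
plan), a processing-pipeline rule can drop the garbage. A minimal sketch; 
the field value being matched is just an example:

```
rule "drop cron chatter"
when
    has_field("application_name") &&
    to_string($message.application_name) == "cron"
then
    // message is discarded and never hits a stream or the index
    drop_message();
end
```

Attach a rule like this to an early pipeline stage on the default stream 
so later stages only see messages someone cares about.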

HTH Cheers Luke.
 
