Pete,

The extractors themselves do not look too bad. However, whenever you use leading wildcards to extract similar data, the work the extractors have to do is repeated, since they are executed one after the other.
If there's no better way to extract that data, you might want to look into Grok patterns, as those will be executed "in parallel". For example, if you have multiple patterns that could potentially match and you combine them with |, they get compiled down into a single regular expression. That should be faster, even though the overall expression is larger. A further upside is that you can extract multiple named fields at once with Grok and, as of 1.1, apply data type conversions. You'll find examples in our documentation.

Best,
Kay

On Fri, Jun 5, 2015, 2:45 AM Pete GS <[email protected]> wrote:
> Hi all,
>
> I've finally discovered the source of my excess CPU load and high load
> averages on my Graylog nodes!
>
> I've got a bunch of extractors that I use to pull information from my
> vSphere platform's VMKernel logs.
>
> The catch with these is that a lot of items in the message string vary
> quite a bit, so finding a regex to match is quite difficult... read pretty
> much impossible for my limited regex skills :)
>
> The way I've worked around this is to use wildcards in the regex strings
> and that seems to be causing my load average to go from ~0.4 to ~2 or even
> more and the CPUs regularly peak at 100%.
>
> Is this expected behaviour?
>
> I recall an issue with earlier versions of Graylog where wildcards in
> stream rules would cause this but I believe that was much improved in the
> 1.0 release and I have noticed that difference. I'm running 1.0.2 at
> present.
>
> Is there a similar improvement with extractors in 1.1 or is it being
> worked on perhaps?
>
> I intend to put 1.1 into my test lab early next week but it doesn't see
> anywhere near as many messages/sec as Production so I won't really see any
> indications until I get it into Production.
>
> I've attached my current extractors.
>
> Any feedback on this would be great, and in the meantime I'll start trying
> to optimise my extractors a bit more to see if I can remove some wildcards.
>
> Cheers, Pete
>
> --
> You received this message because you are subscribed to the Google Groups
> "graylog2" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
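
As a rough illustration of the suggestion in the reply above, here is a minimal sketch of the two approaches: several leading-wildcard patterns run one after the other, versus alternatives joined with | into a single compiled expression with named fields. It uses Python's re module rather than Graylog's Grok engine, and the log lines, field names, and the CONVERTERS table are invented for the example, so treat it as a sketch of the idea only. Whether the combined form is actually faster depends on the regex engine and the patterns involved.

```python
import re

# Hypothetical VMkernel-style messages, invented for illustration only.
MESSAGES = [
    "cpu3:33012)NMP: device naa.600a0b80 latency 42 ms",
    "cpu1:33500)ScsiDeviceIO: Cmd 0x2a failed on path vmhba1:C0:T0:L1",
    "cpu0:32800)Net: link up on vmnic2",
]

# Approach 1: one pattern per extractor, each with a leading wildcard.
# Every pattern is tried against every message, one after the other.
SEPARATE = {
    "latency_ms": re.compile(r".*latency (\d+) ms"),
    "scsi_path": re.compile(r".*failed on path (\S+)"),
    "nic": re.compile(r".*link up on (\S+)"),
}


def extract_sequential(message):
    """Run each pattern in turn and collect whatever matched (as strings)."""
    fields = {}
    for name, pattern in SEPARATE.items():
        match = pattern.match(message)
        if match:
            fields[name] = match.group(1)
    return fields


# Approach 2: the alternatives joined with | and compiled once.
# Named groups play the role of named fields.
COMBINED = re.compile(
    r"latency (?P<latency_ms>\d+) ms"
    r"|failed on path (?P<scsi_path>\S+)"
    r"|link up on (?P<nic>\S+)"
)

# Hypothetical stand-in for data type conversion: field name -> converter.
CONVERTERS = {"latency_ms": int}


def extract_combined(message):
    """Search once with the combined expression and convert known fields."""
    match = COMBINED.search(message)
    if match is None:
        return {}
    fields = {}
    for name, value in match.groupdict().items():
        if value is not None:  # groups from non-matching alternatives are None
            fields[name] = CONVERTERS.get(name, str)(value)
    return fields


if __name__ == "__main__":
    for msg in MESSAGES:
        print(extract_sequential(msg), extract_combined(msg))
```

Note that the combined expression no longer needs the leading wildcards, because it searches for the first alternative that matches anywhere in the message.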
