Hi all,
Ali, I share your opinion concerning Heka's strengths. I also think that
Heka stands out because of the flexibility of its filters. Few, if any,
lightweight data collectors/shippers let you process events with that
many decoders/filters/encoders, with the possibility of chaining them.
The numerous filtering possibilities were what made us use Heka.
Concerning the alternative to Heka, i.e. Elastic's Beats: there is
obviously a lack of outputs. However, things might take a turn, and you
should look at (and might even participate in) this recent pull request
about having community-maintained outputs:
https://github.com/elastic/beats/pull/1681
Vincent
On 2 June 2016 at 22:22, Ali wrote:
Thanks, Rob!
I have to say, I'm EXTREMELY DISAPPOINTED to hear this.
I have been away from Heka for a while (working on other projects at
work) and am now able to refocus on designing our new data
collection/analysis/reporting system. Once I read this e-mail, I
started looking around to see what else was out there and what has
changed over the last several months. Elastic's Beats
<https://www.elastic.co/products/beats> project, particularly
Filebeat
<https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-overview.html>,
seemed like a really interesting and welcome development. However,
compared to the flexibility of Heka's ins and outs, Filebeat seems
to be badly lacking.
Suffice it to say, Heka still seems to stand alone in this space.
Its flexibility is amazing. (Again, mostly talking about inputs and
outputs here.) The closest I can come to it is nxlog
<http://nxlog-ce.sourceforge.net/about>, and I just really dislike
that it's not more transparent and open-source.
Anyway, I understand the rationale behind this decision and am
hopeful that another org will continue work on this project. Thanks
for all of your efforts, Rob et al!
-Ali
P.S. If anyone's interested, here's my situation right now:
https://www.reddit.com/r/bigdata/comments/4m81vo/which_log_collectors_to_use_for_robust_handling/
and
https://discuss.elastic.co/t/how-can-i-get-data-from-filebeat-to-flume/51734
On Fri, May 6, 2016 at 12:51 PM Rob Miller wrote:
Hi everyone,
I'm loooong overdue in sending out an update about the current state of
and plans for Heka. Unfortunately, what I have to share here will
probably be disappointing for many of you, and it might impact whether
or not you want to continue using it, as all signs point to Heka getting
less support and fewer updates moving forward.
The short version is that Heka has some design flaws that make it hard
to incrementally improve it enough to meet the high throughput and
reliability goals that we were hoping to achieve. While it would be
possible to do a major overhaul of the code to resolve most of these
issues, I don't have the personal bandwidth to do that work, since most
of my time is consumed working on Mozilla's immediate data processing
needs rather than general purpose tools these days. Hindsight
(https://github.com/trink/hindsight), built around the same Lua sandbox
technology as Heka, doesn't have these issues, and internally we're
using it more and more instead of Heka, so there's no organizational
imperative for me (or anyone else) to spend the time required to
overhaul the Go code base.
Heka is still in use here, though, especially on our edge nodes, so it
will see a bit more improvement and at least a couple more releases.
Most notably, it's on my list to switch to using the most recent Lua
sandbox code, which will move most of the protobuf processing to custom
C code, and will likely improve performance as well as remove a lot of
the problematic cgo code, which is what's currently keeping us from
being able to upgrade to a recent Go version.
Beyond that, however, Heka's future is uncertain. The code that's there
will still work, of course, but I may not be doing any further
improvements, and my ability to keep up with support requests and PRs,
already on the decline, will likely continue to wane.
So what are the options? If you're using a significant amount of
Lua-based functionality, you might consider transitioning to Hindsight.
Any Lua code that works in Heka will work in Hindsight. Hindsight is a
much leaner and more solid foundation. Hindsight has far fewer I/O
plugins than Heka, though, so for many it won't be a simple transition.
Also, if there's someone out there (an organization, most likely) that
has a strong interest in keeping Heka's codebase alive, through funding
or coding contributions, I'd be happy to support that endeavor. Some
restrictions apply, however; the work that needs to be done to improve
Heka's foundation is not beginner-level work, and my time to help is
very limited, so I'm only willing to support folks who demonstrate that
they are up to the task. Please contact me off-list if you or your
organization is interested.
Anyone casually following along can probably stop reading here. Those of
you interested in the gory details can read on to hear more about what
the issues are and how they might be resolved.
First, I'll say that I think there's a lot that Heka got right. The
basic composition of the pipeline (input -> split -> decode -> route ->
process -> encode -> output) seems to hit a sweet spot for composability
and reuse. The Lua sandbox, and especially the use of LPEG for text
parsing and transformation, has proven to be extremely efficient and
powerful; it's the most important and valuable part of the Heka stack.
The routing infrastructure is efficient and solid. And, perhaps most
importantly, Heka is useful; there are a lot of you out there using it
to get work done.
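
For readers who don't have the pipeline in their head, here is a rough
Go sketch of that composition. It is not Heka's actual plugin API: the
Message type, the "type:payload" record format, and every function name
are made up for illustration, and each stage is reduced to a plain
function so the chaining is visible.

```go
package main

import (
	"fmt"
	"strings"
)

// Message is a hypothetical, heavily simplified stand-in for a Heka message.
type Message struct {
	Type    string
	Payload string
}

// split breaks a raw input stream into newline-delimited records.
func split(raw string) []string {
	return strings.Split(strings.TrimSpace(raw), "\n")
}

// decode parses a "type:payload" record into a Message.
func decode(record string) Message {
	parts := strings.SplitN(record, ":", 2)
	if len(parts) < 2 {
		return Message{Type: "unknown", Payload: record}
	}
	return Message{Type: parts[0], Payload: parts[1]}
}

// route reports whether a message should reach the processing stage.
func route(m Message) bool {
	return m.Type == "error"
}

// process transforms a routed message (here: upper-cases the payload).
func process(m Message) Message {
	m.Payload = strings.ToUpper(m.Payload)
	return m
}

// encode serializes a message for the output stage.
func encode(m Message) string {
	return m.Type + "|" + m.Payload
}

// runPipeline wires the stages together and returns what an output
// plugin would receive.
func runPipeline(raw string) []string {
	var out []string
	for _, record := range split(raw) {
		m := decode(record)
		if !route(m) {
			continue
		}
		out = append(out, encode(process(m)))
	}
	return out
}

func main() {
	for _, line := range runPipeline("error:disk full\ninfo:ok\nerror:oom\n") {
		fmt.Println(line)
	}
}
```

Because each stage only depends on the type it receives, any one of
them can be swapped out without touching the others, which is the
composability point made above.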
There was one fundamental mistake made, however, which is that we
shouldn't have used channels. There are many competing opinions about Go
channels. I'm not going to get into whether or not they're *ever* a
good idea, but I will say unequivocally that their use as the means of
pushing messages through the Heka pipeline was a mistake, for a number
of reasons.
First, they don't perform well enough. While Heka performs many tasks
faster than some other popular tools, we've consistently hit a
throughput ceiling thanks to all of the synchronization that channels
require. And this ceiling, sadly, is generally lower than is acceptable
for the amount of data that we at Mozilla want to push through a single
aggregator system.
Second, they make it very hard to prevent message loss. If unbuffered
channels are used everywhere, performance plummets unacceptably due to
context-switching costs. But using buffered channels means that many
messages are in flight at a time, most of which are sitting in channels
waiting to be processed. Keeping track of which messages have made it
all the way through the pipeline requires complicated coordination
between chunks of code that are conceptually quite far away from each
other.
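
To make the in-flight problem concrete, here is a minimal Go sketch of
a single pipeline hop. It is an illustration only, not Heka code: the
function name and its parameters are invented, and the "crash" is
simulated by simply abandoning the channel.

```go
package main

import "fmt"

// strandedAfterCrash simulates one pipeline hop. The input stage hands
// `accepted` messages to a buffered channel; the downstream stage handles
// `processed` of them and then dies. The return value is the number of
// messages the input already considers handed off but that were never
// processed.
func strandedAfterCrash(accepted, processed int) int {
	ch := make(chan string, accepted)
	for i := 0; i < accepted; i++ {
		ch <- fmt.Sprintf("msg-%d", i) // never blocks: capacity == accepted
	}
	if processed > accepted {
		processed = accepted
	}
	for i := 0; i < processed; i++ {
		<-ch // downstream stage handles one message
	}
	// Simulated crash: whatever is still buffered is gone, and nothing
	// upstream has a record of which messages those were.
	return len(ch)
}

func main() {
	fmt.Println("messages stranded in the buffer:", strandedAfterCrash(8, 3))
}
```

With several buffered hops in a row, the sender of a message has no
local way to know how far it got, which is why end-to-end tracking
needs coordination across otherwise unrelated stages.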
Third, the buffered channels mean that Heka consumes much more RAM than
would be otherwise needed, since we have to pre-allocate a pool of
messages. If the pool size is too small, then Heka becomes susceptible
to deadlocks, with all of the available packs sitting in channel queues,
unable to be processed because some plugin is blocked waiting for an
available pack. But cranking up the pool size causes Heka to use more
memory, even when it's idle.
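
The pool-exhaustion scenario can be sketched in a few lines of Go. This
is a simplified model, not Heka's real pack pool: Pack, Pool, TryGet,
Put and filterCanInject are hypothetical names, and a non-blocking
TryGet is used so the example reports exhaustion instead of actually
hanging.

```go
package main

import "fmt"

// Pack is a hypothetical stand-in for a pooled message container.
type Pack struct{ Payload string }

// Pool is a fixed-size pool of packs, modelled as a buffered channel.
type Pool struct{ free chan *Pack }

// NewPool pre-allocates `size` packs, so memory is held even when idle.
func NewPool(size int) *Pool {
	p := &Pool{free: make(chan *Pack, size)}
	for i := 0; i < size; i++ {
		p.free <- &Pack{}
	}
	return p
}

// TryGet returns a free pack, or false if the pool is exhausted. A
// blocking receive at this point is where a real deadlock would occur.
func (p *Pool) TryGet() (*Pack, bool) {
	select {
	case pack := <-p.free:
		return pack, true
	default:
		return nil, false
	}
}

// Put recycles a pack once a plugin is finished with it.
func (p *Pool) Put(pack *Pack) {
	pack.Payload = ""
	p.free <- pack
}

// filterCanInject moves every pack in the pool into a filter's input
// queue, then reports whether that filter could still obtain a pack to
// emit a message of its own before draining anything.
func filterCanInject(poolSize int) bool {
	pool := NewPool(poolSize)
	filterQueue := make(chan *Pack, poolSize)
	for {
		pack, ok := pool.TryGet()
		if !ok {
			break
		}
		filterQueue <- pack
	}
	_, ok := pool.TryGet()
	return ok
}

func main() {
	fmt.Println("filter can get a pack:", filterCanInject(2))
}
```

In this model the filter is the only thing that can free a pack (by
draining its queue), yet it is also the thing waiting for one, so with
a blocking get nothing would ever move again.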
Hindsight avoids all of these problems by using disk queues instead of
RAM buffers between all of the processing stages. It's a bit
counterintuitive, but at high throughput performance is actually better
than with RAM buffers, because a) there's no need for synchronization
locks and b) the data is typically read quickly enough after it's
written that it stays in the disk cache.
There's much less chance of message loss, because every plugin is
holding on to only one message in memory at a time, while using a
written-to-disk cursor file to track the current position in the disk
buffer. If the plug is pulled mid-process, some messages that were
already processed might be processed again, but nothing will be lost,
and there's no need for complex coordination between different stages of
the pipeline.
Finally, there's no need for a pool of messages. Each plugin is holding
some small number of packs (possibly as few as one) in its own memory
space, and those packs never escape that plugin's ownership. RAM usage
doesn't grow, and deadlocks caused by pool exhaustion are a thing of the
past.
For Heka to have a viable future, it would basically need to be updated
to work almost exactly like Hindsight. First, all of the APIs would need
to be changed to no longer refer to channels. (The fact that we exposed
channels in the APIs is another mistake we made... it's now generally
frowned upon in Go land to expose channels as part of your public APIs.)
There's already a non-channel based API for filters and outputs, but
most of the plugins haven't yet been updated to use the new API, which
would need to happen.
Then the hard work would start: a major overhaul of Heka's internals, to
switch from channel based message passing to disk queue based message
passing. The work that's been done to support disk buffering for filters
and outputs is useful, but not quite enough, because it's not scalable
for each plugin to have its own queue; the number of open file
descriptors would grow very quickly. Instead it would need to work like
Hindsight, where there's one queue that all of the inputs write to, and
another that filters write to. Each plugin reads through its specified
input queue, looking for messages that match its message matcher,
writing its location in the queue back to the shared cursors file.
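
Here is a toy Go model of that shared-queue arrangement. To stay short
it keeps everything in memory: the slice stands in for the on-disk
queue and the map stands in for the shared cursors file. Plugin,
Cursors and Poll are hypothetical names, not Hindsight's or Heka's API.

```go
package main

import "fmt"

// Message is a hypothetical, simplified message.
type Message struct{ Type, Payload string }

// Plugin reads the shared queue and handles only messages its matcher
// accepts.
type Plugin struct {
	Name    string
	Matches func(Message) bool
	Handled []string
}

// Cursors maps a plugin name to the index of the next message it should
// read. It stands in for the shared cursors file.
type Cursors map[string]int

// Poll scans the shared queue from the plugin's cursor to the end.
func (p *Plugin) Poll(queue []Message, cursors Cursors) {
	for i := cursors[p.Name]; i < len(queue); i++ {
		if p.Matches(queue[i]) {
			p.Handled = append(p.Handled, queue[i].Payload)
		}
		cursors[p.Name] = i + 1 // advance past the message, matched or not
	}
}

func main() {
	queue := []Message{{"error", "disk full"}, {"info", "started"}}
	cursors := Cursors{}
	alerts := &Plugin{
		Name:    "alerts",
		Matches: func(m Message) bool { return m.Type == "error" },
	}
	archive := &Plugin{
		Name:    "archive",
		Matches: func(Message) bool { return true },
	}

	alerts.Poll(queue, cursors)
	queue = append(queue, Message{"error", "oom"}) // an input appends more data
	alerts.Poll(queue, cursors)                    // resumes where it left off
	archive.Poll(queue, cursors)                   // reads the same queue independently
	fmt.Println(alerts.Handled, archive.Handled, cursors)
}
```

The point of the design is that adding a plugin adds one cursor entry
rather than one more queue, so the number of open files stays flat as
the plugin count grows.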
There would also be some complexity in reconciling Heka's breakdown of
the input stage into input/splitter/decoder with Hindsight's
encapsulation of all of these stages into a single sandbox.
Ultimately I think this would be at least 2-3 months of full-time work
for me. I'm not the fastest coder around, but I know where the bodies
are buried, so I'd guess it would take anyone else at least as long,
possibly longer if they're not already familiar with how everything is
put together.
And that's about it. If you've gotten this far, thanks for reading.
Also, thanks to everyone who's contributed to Heka in any way, be it by
code, doc fixes, bug reports, or even just appreciation. I'm sorry for
those of you using it regularly that there's not a more stable future.
Regards,
-r
_______________________________________________
Heka mailing list
[email protected] <mailto:[email protected]>
https://mail.mozilla.org/listinfo/heka