Thanks a lot for the excruciating detail; I, for one, appreciated it. I have
written quite a few Go- and Lua-based plugins, and from that experience I would
definitely welcome a more unified, event-based API. While the Go API is very
"Go-ish" and open with its direct use of channels, that's not something I ever
found myself missing in the Lua plugins.

Testing Go plugins would become easier, too; my Lua plugins just need the
proper Lua path environment variables and a stub of read_message(), which is far
easier to provide than the couple hundred lines of pipeline mocking that the Go
plugins need.
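
As a rough sketch of why an event-based Go API could be simpler to test: the
`ProcessMessage` and `TimerEvent` method names below come from the update quoted
further down, but the signatures, the `Message` type, the `EventOutput`
interface, and the toy `BatchOutput` plugin are all assumptions made up for
illustration, not Heka's actual API.

```go
package main

import (
	"errors"
	"fmt"
)

// Message is a hypothetical stand-in for a Heka message; only a payload is modeled.
type Message struct {
	Payload string
}

// EventOutput is an assumed shape for an event-based output plugin API.
// The method names are taken from the quoted update; the signatures are guesses.
type EventOutput interface {
	ProcessMessage(msg *Message) error
	TimerEvent() error
}

// BatchOutput is a toy plugin: it collects payloads and "flushes" them on each tick.
type BatchOutput struct {
	pending []string
	Flushed [][]string // every batch handed off so far
}

// ProcessMessage reports success or failure synchronously via its return value.
func (o *BatchOutput) ProcessMessage(msg *Message) error {
	if msg == nil || msg.Payload == "" {
		return errors.New("empty message")
	}
	o.pending = append(o.pending, msg.Payload)
	return nil
}

// TimerEvent flushes whatever has accumulated since the last tick.
func (o *BatchOutput) TimerEvent() error {
	if len(o.pending) == 0 {
		return nil
	}
	o.Flushed = append(o.Flushed, o.pending)
	o.pending = nil
	return nil
}

func main() {
	b := &BatchOutput{}
	var out EventOutput = b

	// No channels, router, or pipeline mocks: the caller just invokes methods.
	for _, p := range []string{"a", "b"} {
		if err := out.ProcessMessage(&Message{Payload: p}); err != nil {
			fmt.Println("delivery failed:", err)
		}
	}
	if err := out.TimerEvent(); err != nil {
		fmt.Println("flush failed:", err)
	}
	fmt.Println("batches flushed:", len(b.Flushed))
}
```

With a shape like this, a test only has to construct the plugin and call its
methods directly, much like stubbing read_message() on the Lua side.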

In any case, I'm really happy to hear that (a) buffering is a must-have for 1.0
and (b) you're taking the time to design an API that achieves it without
sacrificing performance, while remaining compatible with existing plugins--the
holy grail for complex changes like this.

As always, thanks for your continued work and for making Heka available to the
community; I've really enjoyed using it as a basis for my product!


Rob Miller writes:

> Thanks for asking, I'm definitely overdue to provide an update. While some of 
> you might want to know the details of what's going on, I'm sure most of you 
> don't, so I'll provide two versions of the update:
>
> TL;DR VERSION:
>
> 1.0 will be released after generic buffering for all output plugins has 
> landed and had some time to burn in for a while, so we feel comfortable that 
> it's stable. Unfortunately performance issues related to the current filter 
> and output plugin APIs have caused the implementation of generic buffering to 
> be a *lot* more work than expected. That work is happening alongside lots of 
> other stuff that needs to get done, so it has taken (and will continue to 
> take) a while to be completed.
>
> EXCRUCIATING DETAIL VERSION:
>
> There is no current schedule for a 1.0 release. Originally we were targeting 
> trying to have one out by now, but other work (including the work of actually 
> *using* Heka inside of Mozilla) has gotten in the way. The 
> buffering-for-all-outputs is the last major change that is on deck before 
> 1.0, but that's a big enough change that I'm planning to put out a 0.10 
> release so it gets some break-in time before bumping the revision to 1.0.
>
> A great deal of work has been done on the buffering front, but unfortunately 
> it's ballooned into a bigger job than expected, so there's still a lot more 
> to do. We've put it together so that the buffering happens directly off of 
> the router. Whenever a message_matcher match is found, Heka will check to see 
> whether or not the plugin is using buffering. If no, the message will be 
> placed on the plugin's input channel, as before. If yes, the message will 
> instead be placed in the plugin's disk queue. For each buffered plugin, a 
> separate goroutine will run to pull messages off of the queue and feed them 
> to the plugin.
>
> So far things are pretty straightforward, but here's where they take a turn. 
> In order to support retries, Heka needs to know whether or not a message was 
> successfully delivered. Not only that, it needs to know synchronously, before 
> it advances to the next record in the disk queue. Unfortunately, Heka's 
> current design makes this difficult; outputs run in their own goroutine, and 
> expect to get all of their messages from the input channel. This means that 
> telling Heka whether or not the message delivery was successful involves 
> bi-directional synchronization between the goroutine that is pulling from the 
> disk queue and the output plugin's main goroutine.
>
> I've come up with a way to get this all working while making minimal changes 
> to existing plugin code, but the cross-goroutine synchronization slows 
> everything down. With this scheme, I'm seeing TcpOutput throughput of about 
> 50% what we get with the existing, entirely-inside-the-plugin disk buffering. 
> That's a non-starter. We can get rid of the problem by getting rid of the 
> separate goroutines. But that involves a complete overhaul of the output 
> plugin API, so that the Go API looks more like the existing sandbox API. That 
> means plugin authors would implement `ProcessMessage` and `TimerEvent` 
> methods instead of receiving messages and ticker notifications over channels 
> like they do now.
>
> Needless to say, rewriting every output (and filter, since with this approach 
> they'll support buffering too) to use a fundamentally different API is a lot 
> of work. I'm not excited about that, but I'm not sure how else to approach 
> it. My current thinking is that we'll let both of the APIs exist side by side 
> for a while. This would mean existing plugins could support buffering, albeit 
> with a performance hit, with only trivial code changes. Much improved 
> performance would be possible, but it would require reimplementing the plugin 
> using the newer API. (Note that the performance penalty only applies when 
> buffering is used... the unbuffered performance would match what we have now 
> regardless of which API was used.)
>
> So my next steps are to hammer out the details of the new API, implement 
> support for it in the Heka core, and update the TcpOutput to support it. I'll 
> keep plugging away at it, but that work is happening alongside a number of 
> other important initiatives being worked on, so it will likely take a while 
> longer.
>
> Hope this answers your questions. Ideas, comments, feedback welcome as always.
>
> -r
>
> On 04/23/2015 10:52 AM, Tom Davis wrote:
>> The recent 0.9.2 release prompted me to wonder again about what might be
>> planned for 1.0. The milestone on GitHub seems quite outdated, aside from the
>> generic buffering for output plugins (which sounds great). Rob, Mike, do you
>> guys have any big goals for 1.0? Any timeline?
>>
>> (I don't personally have a burning desire for a particular number, it's just
>> commonly seen as the time when most of the big, breaking stuff has been
>> finished and I'm wondering what's next for Heka)
>>
>> Thanks!
>> _______________________________________________
>> Heka mailing list
>> [email protected]
>> https://mail.mozilla.org/listinfo/heka
>>
