Thanks a lot for the excruciating detail, I for one appreciated it. I have written quite a few Go- and Lua-based plugins, and from that experience I would definitely welcome a more unified, event-based API. While the Go API is very "Go-ish" and open with its direct use of channels, that's not a detail I ever found myself missing in the Lua plugins.
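To make the contrast concrete, here is a minimal Go sketch of what an event-style output could look like. The method names `ProcessMessage` and `TimerEvent` are taken from the quoted message below; the signatures, the `Message` type, and the `LineOutput` plugin are hypothetical and are not Heka's actual API.

```go
package main

import (
	"errors"
	"fmt"
)

// Message is a hypothetical stand-in for a Heka message.
type Message struct {
	Payload string
}

// EventOutput is an assumed shape for an event-style output plugin.
// Only the method names come from the thread; the signatures are guesses.
type EventOutput interface {
	ProcessMessage(msg *Message) error
	TimerEvent() error
}

// LineOutput is a hypothetical plugin that collects payloads and
// flushes them whenever the timer fires.
type LineOutput struct {
	pending []string
	Flushed []string
}

// ProcessMessage reports success or failure through its return value,
// so the caller learns the outcome synchronously.
func (o *LineOutput) ProcessMessage(msg *Message) error {
	if msg == nil || msg.Payload == "" {
		return errors.New("empty message")
	}
	o.pending = append(o.pending, msg.Payload)
	return nil
}

// TimerEvent moves everything collected so far into Flushed.
func (o *LineOutput) TimerEvent() error {
	o.Flushed = append(o.Flushed, o.pending...)
	o.pending = nil
	return nil
}

func main() {
	out := &LineOutput{}
	var _ EventOutput = out // compile-time check that the interface is satisfied

	fmt.Println(out.ProcessMessage(&Message{Payload: "hello"}))
	fmt.Println(out.ProcessMessage(&Message{}))
	out.TimerEvent()
	fmt.Println(out.Flushed)
}
```

Because such a plugin is just a value with two methods, a test can call them directly with a hand-built message instead of mocking a pipeline and its channels.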
Testing Go plugins would become easier, too; my Lua plugins just need the proper Lua path environment variables and a stub of read_message(), which is far easier to provide than the couple hundred lines of pipeline mocking that the Go plugins need.

In any case, I'm really happy to hear that (a) buffering is a must-have for 1.0 and (b) you're taking the time to design an API that achieves it without sacrificing performance, while remaining compatible with existing plugins--the holy grail for complex changes like this.

As always, thanks for your continued work and for making Heka available to the community; I've really enjoyed using it as a basis for my product!

Rob Miller writes:

> Thanks for asking, I'm definitely overdue to provide an update. While some of
> you might want to know the details of what's going on, I'm sure most of you
> don't, so I'll provide two versions of the update:
>
> TL;DR VERSION:
>
> 1.0 will be released after generic buffering for all output plugins has
> landed and had some time to burn in for a while, so we feel comfortable that
> it's stable. Unfortunately performance issues related to the current filter
> and output plugin APIs have caused the implementation of generic buffering to
> be a *lot* more work than expected. That work is happening alongside lots of
> other stuff that needs to get done, so it has taken (and will continue to
> take) a while to be completed.
>
> EXCRUCIATING DETAIL VERSION:
>
> There is no current schedule for a 1.0 release. Originally we were targeting
> trying to have one out by now, but other work (including the work of actually
> *using* Heka inside of Mozilla) has gotten in the way. The
> buffering-for-all-outputs is the last major change that is on deck before
> 1.0, but that's a big enough change that I'm planning to put out a 0.10
> release so it gets some break-in time before bumping the revision to 1.0.
>
> A great deal of work has been done on the buffering front, but unfortunately
> it's ballooned into a bigger job than expected, so there's still a lot more
> to do. We've put it together so that the buffering happens directly off of
> the router. Whenever a message_matcher match is found, Heka will check to see
> whether or not the plugin is using buffering. If no, the message will be
> placed on the plugin's input channel, as before. If yes, the message will
> instead be placed in the plugin's disk queue. For each buffered plugin, a
> separate goroutine will run to pull messages off of the queue and feed them
> to the plugin.
>
> So far things are pretty straightforward, but here's where they take a turn.
> In order to support retries, Heka needs to know whether or not a message was
> successfully delivered. Not only that, it needs to know synchronously, before
> it advances to the next record in the disk queue. Unfortunately, Heka's
> current design makes this difficult; outputs run in their own goroutine, and
> expect to get all of their messages from the input channel. This means that
> telling Heka whether or not the message delivery was successful involves
> bi-directional synchronization between the goroutine that is pulling from the
> disk queue and the output plugin's main goroutine.
>
> I've come up with a way to get this all working while making minimal changes
> to existing plugin code, but the cross-goroutine synchronization slows
> everything down. With this scheme, I'm seeing TcpOutput throughput of about
> 50% what we get with the existing, entirely-inside-the-plugin disk buffering.
> That's a non-starter. We can get rid of the problem by getting rid of the
> separate goroutines. But that involves a complete overhaul of the output
> plugin API, so that the Go API looks more like the existing sandbox API. That
> means plugin authors would implement `ProcessMessage` and `TimerEvent`
> methods instead of receiving messages and ticker notifications over channels
> like they do now.
>
> Needless to say, rewriting every output (and filter, since with this approach
> they'll support buffering too) to use a fundamentally different API is a lot
> of work. I'm not excited about that, but I'm not sure how else to approach
> it. My current thinking is that we'll let both of the APIs exist side by side
> for a while. This would mean existing plugins could support buffering, albeit
> with a performance hit, with only trivial code changes. Much improved
> performance would be possible, but it would require reimplementing the plugin
> using the newer API. (Note that the performance penalty only applies when
> buffering is used... the unbuffered performance would match what we have now
> regardless of which API was used.)
>
> So my next steps are to hammer out the details of the new API, implement
> support for it in the Heka core, and update the TcpOutput to support it. I'll
> keep plugging away at it, but that work is happening alongside a number of
> other important initiatives being worked on, so it will likely take a while
> longer.
>
> Hope this answers your questions. Ideas, comments, feedback welcome as always.
>
> -r
>
> On 04/23/2015 10:52 AM, Tom Davis wrote:
>> The recent 0.9.2 release prompted me to wonder again about what might be
>> planned for 1.0. The milestone on GitHub seems quite outdated, aside from
>> the generic buffering for output plugins (which sounds great). Rob, Mike,
>> do you guys have any big goals for 1.0? Any timeline?
>>
>> (I don't personally have a burning desire for a particular number, it's just
>> commonly seen as the time when most of the big, breaking stuff has been
>> finished and I'm wondering what's next for Heka)
>>
>> Thanks!
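The synchronous-delivery requirement described in the quoted message can be sketched as a loop that only advances the queue cursor after the plugin reports success. This is an illustration under stated assumptions, not Heka's implementation: `Queue` is an in-memory stand-in for the disk queue, and `Processor`, `Drain`, `flaky`, and the `maxRetries` parameter are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

// Processor is a hypothetical event-style plugin. Delivery success or
// failure comes back synchronously as the return value.
type Processor interface {
	ProcessMessage(payload string) error
}

// Queue is an in-memory stand-in for the on-disk queue. The cursor
// marks the next undelivered record.
type Queue struct {
	records []string
	cursor  int
}

// Drain feeds records to p in order, in the caller's goroutine. A failed
// record is retried up to maxRetries times; the cursor moves forward only
// after a successful delivery. It returns the number of records delivered.
func (q *Queue) Drain(p Processor, maxRetries int) (int, error) {
	delivered := 0
	for q.cursor < len(q.records) {
		rec := q.records[q.cursor]
		var err error
		for attempt := 0; attempt <= maxRetries; attempt++ {
			if err = p.ProcessMessage(rec); err == nil {
				break
			}
		}
		if err != nil {
			// Leave the cursor in place so the record can be retried later.
			return delivered, fmt.Errorf("record %d: %w", q.cursor, err)
		}
		q.cursor++
		delivered++
	}
	return delivered, nil
}

// flaky is a test double that fails its first failFirst calls.
type flaky struct {
	failFirst int
	calls     int
	got       []string
}

func (f *flaky) ProcessMessage(payload string) error {
	f.calls++
	if f.calls <= f.failFirst {
		return errors.New("connection refused")
	}
	f.got = append(f.got, payload)
	return nil
}

func main() {
	q := &Queue{records: []string{"a", "b", "c"}}
	out := &flaky{failFirst: 2}
	n, err := q.Drain(out, 3)
	fmt.Println(n, err, out.got)
}
```

No channel or second goroutine sits between the queue reader and the plugin here, which is the property the message says removes the cross-goroutine synchronization cost.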
_______________________________________________
Heka mailing list
[email protected]
https://mail.mozilla.org/listinfo/heka

