On Fri, 17 Aug 2018, at 14:27, Sankalp Kohli wrote:
> I am bumping this thread because patch has landed for this with repair
> functionality.

We are looking to contribute Reaper to the Cassandra project. Looking at the patch, it's very similar in its base design already, but Reaper does have a lot more to offer. We have all been working hard to move it to also being a side-car so it can be contributed.

This raises a number of questions relevant to this thread: would we then accept both works in the Cassandra project, and what burden would it put on the current PMC to maintain both works?

I share Stefan's concern that consensus had not been reached around a side-car, and that it was somehow accepted by default before a patch landed. This seems at odds with the fact that we're already struggling to keep up with the incoming patches/contributions, and there could be other git repos in the project we will need to support in the future too. But I'm also curious about the whole "Community over Code" angle to this: how do we encourage multiple external works to collaborate, building value both technically and in the community?

The Reaper project has worked hard in building both its user and contributor base. And I would have thought these, including having the contributor base overlap with the C* PMC, were prerequisites before moving a larger body of work into the project (separate git repo or not). I guess this isn't so much "Community over Code", but it illustrates a concern regarding abandoned code when there's no existing track record of maintaining it as OSS, as opposed to expecting an existing "show, don't tell" culture. Reaper, for example, has stronger indicators for ongoing support and an existing OSS user base: today the C* committers who have contributed to Reaper are Jon, Stefan, Nate, and myself, among the 40 contributors in total. And we've been taking steps to involve it more in the C* community (e.g. the users ML), without being too presumptuous.

On the technical side: Reaper supports (or can easily support) all the concerns that the proposal here raises: distributed nodetool commands, centralising JMX interfacing, scheduling ops (repairs, snapshots, compactions, cleanups, etc), monitoring and diagnostics, etc. It's designed so that it can be a single instance, instance-per-datacenter, or side-car (per process). When there are multiple instances in a datacenter you get HA. You have a choice of different storage backends (memory, postgres, C*). You can of course use a separate C* cluster as a backend so as to separate infrastructure data from production data. And it's got a UI for C* Diagnostics already (which requires a different JMX interface, polling for events rather than subscribing to JMX notifications, which we know to be problematic, thanks to Stefan). Anyway, that's my plug for Reaper :-)

There's been little effort in evaluating these two bodies of work, one of which is largely unknown to us, and my concern is how we would fairly support both going into the future.

Another option would be that this side-car patch first exists as a github project for a period of time, on par with how Reaper has been. This will help evaluate its use and first build up its contributors. This makes it easier for the C* PMC to choose which projects it would want to formally maintain, and to do so based on factors beyond technical merit alone. We may even see it converge (or collaborate more) with Reaper, a win for everyone.

regards,
Mick