RE: QA signup

2018-09-07 Thread Per Otterström
This is a great initiative! If we can create a structured verification approach 
on new and old features I think 4.0 will be up for a good start. Jon, you can 
add my team to that signup sheet.

/pelle

-----Original Message-----
From: Varun Barala  
Sent: 6 September 2018 15:16
To: dev@cassandra.apache.org
Subject: Re: QA signup

+1
I personally would like to contribute.

On Thu, Sep 6, 2018 at 8:51 PM Jonathan Haddad  wrote:

> For 4.0, I'm thinking it would be a good idea to put together a list 
> of the things that need testing and see if people are willing to help 
> test / break those things.  My goal here is to get as much coverage as 
> possible, and let folks focus on really hammering on specific things 
> rather than just firing up a cluster and rubber stamping it.  If we're 
> going to be able to confidently deploy 4.0 quickly after its release
> we're going to need a high attention to detail.
>
> In addition to a signup sheet, I think providing some guidance on how 
> to QA each thing that's being tested would go a long way.  Throwing 
> "hey please test sstable streaming" over the wall will only get 
> quality feedback from folks that are already heavily involved in the 
> development process.  It would be nice to bring some new faces into 
> the project by providing a little guidance.
>
> We could help facilitate this even further by considering the people 
> signing up to test a particular feature as a team, with seasoned 
> Cassandra veterans acting as team leads.
>
> Any thoughts?  I'm happy to take the lead on this.
> --
> Jon Haddad
> http://www.rustyrazorblade.com
> twitter: rustyrazorblade
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: QA signup

2018-09-07 Thread Hyunsoo Lee
I would like to contribute as well.

Best,
Hyunsoo

On Fri., Sep. 7, 2018, 7:22 a.m. Per Otterström, <
per.otterst...@ericsson.com> wrote:

> This is a great initiative! If we can create a structured verification
> approach on new and old features I think 4.0 will be up for a good start.
> Jon, you can add my team to that signup sheet.
>
> /pelle
>
> -----Original Message-----
> From: Varun Barala 
> Sent: 6 September 2018 15:16
> To: dev@cassandra.apache.org
> Subject: Re: QA signup
>
> +1
> I personally would like to contribute.
>
> On Thu, Sep 6, 2018 at 8:51 PM Jonathan Haddad  wrote:
>
> > For 4.0, I'm thinking it would be a good idea to put together a list
> > of the things that need testing and see if people are willing to help
> > test / break those things.  My goal here is to get as much coverage as
> > possible, and let folks focus on really hammering on specific things
> > rather than just firing up a cluster and rubber stamping it.  If we're
> > going to be able to confidently deploy 4.0 quickly after its release
> > we're going to need a high attention to detail.
> >
> > In addition to a signup sheet, I think providing some guidance on how
> > to QA each thing that's being tested would go a long way.  Throwing
> > "hey please test sstable streaming" over the wall will only get
> > quality feedback from folks that are already heavily involved in the
> > development process.  It would be nice to bring some new faces into
> > the project by providing a little guidance.
> >
> > We could help facilitate this even further by considering the people
> > signing up to test a particular feature as a team, with seasoned
> > Cassandra veterans acting as team leads.
> >
> > Any thoughts?  I'm happy to take the lead on this.
> > --
> > Jon Haddad
> > http://www.rustyrazorblade.com
> > twitter: rustyrazorblade
> >
>


Re: QA signup

2018-09-07 Thread Jonathan Haddad
Really good idea JD. Keeping all the tests under an umbrella ticket for the
feature with everything linked back makes a lot of sense.

On Thu, Sep 6, 2018 at 11:09 PM J. D. Jordan 
wrote:

> I would suggest that JIRAs tagged as 4.0 blockers be created for the list
> once it is fleshed out.  Test plans and results could be posted to said
> JIRAs, to be closed once a given test passes. Any bugs found can also then
> be related back to such a ticket for tracking them as well.
>
> -Jeremiah
>
> > On Sep 6, 2018, at 12:27 PM, Jonathan Haddad  wrote:
> >
> > I completely agree with you, Sankalp.  I didn't want to dig too deep into
> > the underlying testing methodology (and I still think we shouldn't just
> > yet) but if the goal is to have confidence in the release, our QA process
> > needs to be comprehensive.
> >
> > I believe that having focused teams for each component with a team leader
> > with support from committers & contributors gives us the best shot at
> > defining large scale functional tests that can be used to form both
> > progress and bug reports.  (A person could / hopefully will be on more
> than
> > one team).  Coming up with those comprehensive tests will be the jobs of
> > the teams, getting frequent bidirectional feedback on the dev ML.  Bugs
> go
> > in JIRA as per usual.
> >
> > Hopefully we can continue this process after the release, giving the
> > project more structure, and folding more people in over time as
> > contributors and ideally committers / PMC.
> >
> > Jon
> >
> >
> >> On Thu, Sep 6, 2018 at 1:15 PM sankalp kohli 
> wrote:
> >>
> >> Thanks for starting this Jon.
> >> Instead of saying "I tested streaming", we should define what all was
> >> tested like was all data transferred, what happened when stream failed,
> >> etc.
> >> Based on talking to a few users, looks like most testing is done by
> doing
> >> an operation or running a load and seeing if it "worked" and no errors
> in
> >> logs.
> >>
> >> Another important thing will be to fix bugs asap ahead of testing,  as
> >> fixes can lead to more bugs :)
> >>
> >>>> On Thu, Sep 6, 2018 at 7:52 AM Jonathan Haddad
> wrote:
> >>>
> >>> I was thinking along the same lines.  For this to be successful I think
> >>> either weekly or bi-weekly summary reports back to the mailing list by
> >> the
> >>> team lead for each subsection on what's been tested and how it's been
> >>> tested will help keep things moving along.
> >>>
> >>> In my opinion the lead for each team should *not* be the contributor
> that
> >>> wrote the feature, but someone who's very interested in it and can use
> >> the
> >>> contributor as a resource.  I think it would be difficult for the
> >>> contributor to poke holes in their own work - if they could do that it
> >>> would have been done already.  This should be a verification process
> >> that's
> >>> independent as possible from the original work.
> >>>
> >>> In addition to the QA process, it would be great if we could get a docs
> >>> team together.  We've got quite a bit of undocumented features and
> nuance
> >>> still, I think hammering that out would be a good idea.  Mick brought
> up
> >>> updating the website docs in the thread on testing different JDK's [1],
> >> if
> >>> we could figure that out in the process we'd be in a really great
> >> position
> >>> from the user perspective.
> >>>
> >>> Jon
> >>>
> >>> [1]
> >>
> https://lists.apache.org/thread.html/5645178efb57939b96e73ab9c298e80ad8e76f11a563b4d250c1ae38@%3Cdev.cassandra.apache.org%3E
> >>>
> >>>>> On Thu, Sep 6, 2018 at 10:35 AM Jordan West
> wrote:
> >>>>
> >>>> Thanks for starting this thread Jon!
> >>>>
> >>>>> On Thu, Sep 6, 2018 at 5:51 AM Jonathan Haddad
> >>>> wrote:
> >>>>
> >>>>> For 4.0, I'm thinking it would be a good idea to put together a list
> >> of
> >>>> the
> >>>>> things that need testing and see if people are willing to help test /
> >>>> break
> >>>>> those things.  My goal here is to get as much coverage as possible,
> >> and
> >>>> let
> >>>>> folks focus on really hammering on specific things rather than just
> >>>> firing
> >>>>> up a cluster and rubber stamping it.  If we're going to be able to
> >>>>> confidently deploy 4.0 quickly after its release we're going to
> >> need a
> >>>>> high attention to detail.
> >>>> +1 to a more coordinated effort. I think we could use the Confluence
> >> that
> >>>> was set up a little bit ago since it was set up for this purpose, at
> >> least
> >>>> for finalized plans and results:
> >>>> https://cwiki.apache.org/confluence/display/CASSANDRA.
> >>>>
> >>>>
> >>>>> In addition to a signup sheet, I think providing some guidance on how
> >>> to
> >>>> QA
> >>>>> each thing that's being tested would go a long way.  Throwing "hey
> >>> please
> >>>>> test sstable streaming" over the wall will only get quality feedback
> >>> from
> >>>>> folks that are already heavily involved in the development process.
> >> It
> >>>>> would be nice to bring some new faces into the project b

Re: QA signup

2018-09-07 Thread Joseph Lynch
I don't think anyone has mentioned this yet but we probably want to
consider releasing 4.0 alpha jars to maven central soon so the open
source ecosystem can start testing a consistent Cassandra 4.0; for
example I had to hack 4.0 into Priam's build [1] by manually building
a jar and checking it in which is ... not particularly good or
reproducible for others. I'm not sure how hard it would be but
supporting periodic SNAPSHOT releases would also at least allow
building against trunk and would be great too. It also might be a good
idea to have a document (confluence page?) of breaking changes that
are most likely to require a change from users. For example the
SeedProvider interface change is probably going to break almost
everyone's deployment (but is easy to fix), and having a central list
of removed yaml options might be helpful past the NEWS file.

Regarding testing areas, we deployed trunk in the Netflix testing
environment on Wednesday with the aim to test the netty internode
messaging subsystem on 200+ node clusters. We are working with Jason,
Dinesh, and Jordan and have already found some interesting results and
would like to write them down as well as working on establishing good
baselines and testing methodology for stressing that subsystem. Is the
consensus here to create Jira epics tagged with 4.0 blocker for each
subsystem, or confluence pages (if confluence I think we need to give
people permissions to add pages?)?

Other areas we can help test and are looking for collaborators on are
audit/full query logging and we are potentially interested in helping
to test repair, but our internal implementation doesn't support
Cassandra 4.x ... we can re-work the CASSANDRA-14346 patch without too
much effort I think to thoroughly test full/incremental repair at any
scale cluster (or maybe Reaper folks can test repair).

[1] 
https://github.com/Netflix/Priam/pull/713/files#diff-3c33bef9f0334cf724470d50eae8dd8b

-Joey

On Fri, Sep 7, 2018 at 9:57 AM Jonathan Haddad  wrote:
>
> Really good idea JD. Keeping all the tests under an umbrella ticket for the
> feature with everything linked back makes a lot of sense.
>
> On Thu, Sep 6, 2018 at 11:09 PM J. D. Jordan 
> wrote:
>
> > I would suggest that JIRAs tagged as 4.0 blockers be created for the list
> > once it is fleshed out.  Test plans and results could be posted to said
> > JIRAs, to be closed once a given test passes. Any bugs found can also then
> > be related back to such a ticket for tracking them as well.
> >
> > -Jeremiah
> >
> > > On Sep 6, 2018, at 12:27 PM, Jonathan Haddad  wrote:
> > >
> > > I completely agree with you, Sankalp.  I didn't want to dig too deep into
> > > the underlying testing methodology (and I still think we shouldn't just
> > > yet) but if the goal is to have confidence in the release, our QA process
> > > needs to be comprehensive.
> > >
> > > I believe that having focused teams for each component with a team leader
> > > with support from committers & contributors gives us the best shot at
> > > defining large scale functional tests that can be used to form both
> > > progress and bug reports.  (A person could / hopefully will be on more
> > than
> > > one team).  Coming up with those comprehensive tests will be the jobs of
> > > the teams, getting frequent bidirectional feedback on the dev ML.  Bugs
> > go
> > > in JIRA as per usual.
> > >
> > > Hopefully we can continue this process after the release, giving the
> > > project more structure, and folding more people in over time as
> > > contributors and ideally committers / PMC.
> > >
> > > Jon
> > >
> > >
> > >> On Thu, Sep 6, 2018 at 1:15 PM sankalp kohli 
> > wrote:
> > >>
> > >> Thanks for starting this Jon.
> > >> Instead of saying "I tested streaming", we should define what all was
> > >> tested like was all data transferred, what happened when stream failed,
> > >> etc.
> > >> Based on talking to a few users, looks like most testing is done by
> > doing
> > >> an operation or running a load and seeing if it "worked" and no errors
> > in
> > >> logs.
> > >>
> > >> Another important thing will be to fix bugs asap ahead of testing,  as
> > >> fixes can lead to more bugs :)
> > >>
> > >>>> On Thu, Sep 6, 2018 at 7:52 AM Jonathan Haddad
> > wrote:
> > >>>
> > >>> I was thinking along the same lines.  For this to be successful I think
> > >>> either weekly or bi-weekly summary reports back to the mailing list by
> > >> the
> > >>> team lead for each subsection on what's been tested and how it's been
> > >>> tested will help keep things moving along.
> > >>>
> > >>> In my opinion the lead for each team should *not* be the contributor
> > that
> > >>> wrote the feature, but someone who's very interested in it and can use
> > >> the
> > >>> contributor as a resource.  I think it would be difficult for the
> > >>> contributor to poke holes in their own work - if they could do that it
> > >>> would have been done already.  This should be a verification process
> > >> that's
> 

Re: QA signup

2018-09-07 Thread Scott Andreas
Thanks for getting started with performance testing - this is exciting to hear!

Periodic SNAPSHOT builds sound great. I'd feel much better about builds
published as date- or SHA-stamped snapshots / nightlies rather than calling 
them alphas at this point, as everyone's testing work is beginning. Can someone 
offer details on what would need to be done to publish snapshots or nightlies 
in the context of Apache build infrastructure?

In looking at the Confluence space restrictions, it appears the main page is 
open for editing and I don't see restrictions on page creation; can you try to 
sign in, create one, and let me know if that doesn't work?

I'm planning to create a few more pages for a couple testing tracks in a few 
days, some of which are described here [1]. Eager to collaborate on these as 
well.

–––
[1] http://cassandra.apache.org/blog/2018/08/21/testing_apache_cassandra.html


From: Joseph Lynch 
Sent: Friday, September 7, 2018 1:20:19 PM
To: dev@cassandra.apache.org
Subject: Re: QA signup

I don't think anyone has mentioned this yet but we probably want to
consider releasing 4.0 alpha jars to maven central soon so the open
source ecosystem can start testing a consistent Cassandra 4.0; for
example I had to hack 4.0 into Priam's build [1] by manually building
a jar and checking it in which is ... not particularly good or
reproducible for others. I'm not sure how hard it would be but
supporting periodic SNAPSHOT releases would also at least allow
building against trunk and would be great too. It also might be a good
idea to have a document (confluence page?) of breaking changes that
are most likely to require a change from users. For example the
SeedProvider interface change is probably going to break almost
everyone's deployment (but is easy to fix), and having a central list
of removed yaml options might be helpful past the NEWS file.

Regarding testing areas, we deployed trunk in the Netflix testing
environment on Wednesday with the aim to test the netty internode
messaging subsystem on 200+ node clusters. We are working with Jason,
Dinesh, and Jordan and have already found some interesting results and
would like to write them down as well as working on establishing good
baselines and testing methodology for stressing that subsystem. Is the
consensus here to create Jira epics tagged with 4.0 blocker for each
subsystem, or confluence pages (if confluence I think we need to give
people permissions to add pages?)?

Other areas we can help test and are looking for collaborators on are
audit/full query logging and we are potentially interested in helping
to test repair, but our internal implementation doesn't support
Cassandra 4.x ... we can re-work the CASSANDRA-14346 patch without too
much effort I think to thoroughly test full/incremental repair at any
scale cluster (or maybe Reaper folks can test repair).

[1] 
https://github.com/Netflix/Priam/pull/713/files#diff-3c33bef9f0334cf724470d50eae8dd8b

-Joey

On Fri, Sep 7, 2018 at 9:57 AM Jonathan Haddad  wrote:
>
> Really good idea JD. Keeping all the tests under an umbrella ticket for the
> feature with everything linked back makes a lot of sense.
>
> On Thu, Sep 6, 2018 at 11:09 PM J. D. Jordan 
> wrote:
>
> > I would suggest that JIRAs tagged as 4.0 blockers be created for the list
> > once it is fleshed out.  Test plans and results could be posted to said
> > JIRAs, to be closed once a given test passes. Any bugs found can also then
> > be related back to such a ticket for tracking them as well.
> >
> > -Jeremiah
> >
> > > On Sep 6, 2018, at 12:27 PM, Jonathan Haddad  wrote:
> > >
> > > I completely agree with you, Sankalp.  I didn't want to dig too deep into
> > > the underlying testing methodology (and I still think we shouldn't just
> > > yet) but if the goal is to have confidence in the release, our QA process
> > > needs to be comprehensive.
> > >
> > > I believe that having focused teams for each component with a team leader
> > > with support from committers & contributors gives us the best shot at
> > > defining large scale functional tests that can be used to form both
> > > progress and bug reports.  (A person could / hopefully will be on more
> > than
> > > one team).  Coming up with those comprehensive tests will be the jobs of
> > > the teams, getting frequent bidirectional feedback on the dev ML.  Bugs
> > go
> > > in JIRA as per usual.
> > >
> > > Hopefully we can continue this process after the release, giving the
> > > project more structure, and folding more people in over time as
> > > contributors and ideally committers / PMC.
> > >
> > > Jon
> > >
> > >
> > >> On Thu, Sep 6, 2018 at 1:15 PM sankalp kohli 
> > wrote:
> > >>
> > >> Thanks for starting this Jon.
> > >> Instead of saying "I tested streaming", we should define what all was
> > >> tested like was all data transferred, what happened when stream failed,
> > >> etc.
> > >> Based on talking to a few users, looks like most 

Re: Proposing an Apache Cassandra Management process

2018-09-07 Thread Jeff Jirsa
How can we continue moving this forward?

Mick/Jon/TLP folks, is there a path here where we commit the
Netflix-provided management process, and you augment Reaper to work with it?
Is there a way we can make a larger umbrella that's modular that can
support either/both?
Does anyone believe there's a clear, objective argument that one is
strictly better than the other? I haven't seen one.



On Mon, Aug 20, 2018 at 4:14 PM Roopa Tangirala
 wrote:

> +1 to everything that Joey articulated with emphasis on the fact that
> contributions should be evaluated based on the merit of code and their
> value add to the whole offering. I  hope it does not matter whether that
> contribution comes from PMC member or a person who is not a committer. I
> would like the process to be such that it encourages the new members to be
> a part of the community and not shy away from contributing to the code
> assuming their contributions are valued differently than committers or PMC
> members. It would be sad to see the contributions decrease if we go down
> that path.
>
> *Regards,*
>
> *Roopa Tangirala*
>
> Engineering Manager CDE
>
> *(408) 438-3156 - mobile*
>
>
>
>
>
>
> On Mon, Aug 20, 2018 at 2:58 PM Joseph Lynch 
> wrote:
>
> > > We are looking to contribute Reaper to the Cassandra project.
> > >
> > Just to clarify are you proposing contributing Reaper as a project via
> > donation or you are planning on contributing the features of Reaper as
> > patches to Cassandra? If the former how far along are you on the donation
> > process? If the latter, when do you think you would have patches ready
> for
> > consideration / review?
> >
> >
> > > Looking at the patch it's very similar in its base design already, but
> > > Reaper does have a lot more to offer. We have all been working hard to
> > move
> > > it to also being a side-car so it can be contributed. This raises a
> > number
> > > of relevant questions to this thread: would we then accept both works
> in
> > > the Cassandra project, and what burden would it put on the current PMC
> to
> > > maintain both works.
> > >
> > I would hope that we would collaborate on merging the best parts of all
> > into the official Cassandra sidecar, taking the always on, shared
> nothing,
> > highly available system that we've contributed a patchset for and adding
> in
> > many of the repair features (e.g. schedules, a nice web UI) that Reaper
> > has.
> >
> >
> > > I share Stefan's concern that consensus had not been met around a
> > > side-car, and that it was somehow default accepted before a patch
> landed.
> >
> >
> > I feel this is not correct or fair. The sidecar and repair discussions
> have
> > been anything _but_ "default accepted". The timeline of consensus
> building
> > involving the management sidecar and repair scheduling plans:
> >
> > Dec 2016: Vinay worked with Jon and Alex to try to collaborate on Reaper
> to
> > come up with design goals for a repair scheduler that could work at
> Netflix
> > scale.
> >
> > ~Feb 2017: Netflix believes that the fundamental design gaps prevented us
> > from using Reaper as it relies heavily on remote JMX connections and
> > central coordination.
> >
> > Sep. 2017: Vinay gives a lightning talk at NGCC about a highly available
> > and distributed repair scheduling sidecar/tool. He is encouraged by
> > multiple committers to build repair scheduling into the daemon itself and
> > not as a sidecar so the database is truly eventually consistent.
> >
> > ~Jun. 2017 - Feb. 2018: Based on internal need and the positive feedback
> at
> > NGCC, Vinay and myself prototype the distributed repair scheduler within
> > Priam and roll it out at Netflix scale.
> >
> > Mar. 2018: I open a Jira (CASSANDRA-14346) along with a detailed 20 page
> > design document for adding repair scheduling to the daemon itself and
> open
> > the design up for feedback from the community. We get feedback from Alex,
> > Blake, Nate, Stefan, and Mick. As far as I know there were zero proposals
> > to contribute Reaper at this point. We hear the consensus that the
> > community would prefer repair scheduling in a separate distributed
> sidecar
> > rather than in the daemon itself and we re-work the design to match this
> > consensus, re-aligning with our original proposal at NGCC.
> >
> > Apr 2018: Blake brings the discussion of repair scheduling to the dev
> list
> > (
> >
> >
> https://lists.apache.org/thread.html/760fbef677f27aa5c2ab4c375c7efeb81304fea428deff986ba1c2eb@%3Cdev.cassandra.apache.org%3E
> > ).
> > Many community members give positive feedback that we should solve it as
> > part of Cassandra and there is still no mention of contributing Reaper at
> > this point. The last message is my attempted summary giving context on
> how
> > we want to take the best of all the sidecars (OpsCenter, Priam, Reaper)
> and
> > ship them with Cassandra.
> >
> > Apr. 2018: Dinesh opens CASSANDRA-14395 along with a public design
> document
> > for gathering feedback on a general mana

Re: Proposing an Apache Cassandra Management process

2018-09-07 Thread Jonathan Haddad
We haven’t even defined any requirements for an admin tool. It’s hard to
make a case for anything without agreement on what we’re trying to build.

On Fri, Sep 7, 2018 at 7:17 PM Jeff Jirsa  wrote:

> How can we continue moving this forward?
>
> Mick/Jon/TLP folks, is there a path here where we commit the
> Netflix-provided management process, and you augment Reaper to work with
> it?
> Is there a way we can make a larger umbrella that's modular that can
> support either/both?
> Does anyone believe there's a clear, objective argument that one is
> strictly better than the other? I haven't seen one.
>
>
>
> On Mon, Aug 20, 2018 at 4:14 PM Roopa Tangirala
>  wrote:
>
> > +1 to everything that Joey articulated with emphasis on the fact that
> > contributions should be evaluated based on the merit of code and their
> > value add to the whole offering. I  hope it does not matter whether that
> > contribution comes from PMC member or a person who is not a committer. I
> > would like the process to be such that it encourages the new members to
> be
> > a part of the community and not shy away from contributing to the code
> > assuming their contributions are valued differently than committers or
> PMC
> > members. It would be sad to see the contributions decrease if we go down
> > that path.
> >
> > *Regards,*
> >
> > *Roopa Tangirala*
> >
> > Engineering Manager CDE
> >
> > *(408) 438-3156 - mobile*
> >
> >
> >
> >
> >
> >
> > On Mon, Aug 20, 2018 at 2:58 PM Joseph Lynch 
> > wrote:
> >
> > > > We are looking to contribute Reaper to the Cassandra project.
> > > >
> > > Just to clarify are you proposing contributing Reaper as a project via
> > > donation or you are planning on contributing the features of Reaper as
> > > patches to Cassandra? If the former how far along are you on the
> donation
> > > process? If the latter, when do you think you would have patches ready
> > for
> > > consideration / review?
> > >
> > >
> > > > Looking at the patch it's very similar in its base design already,
> but
> > > > Reaper does have a lot more to offer. We have all been working hard to
> > > move
> > > > it to also being a side-car so it can be contributed. This raises a
> > > number
> > > > of relevant questions to this thread: would we then accept both works
> > in
> > > > the Cassandra project, and what burden would it put on the current
> PMC
> > to
> > > > maintain both works.
> > > >
> > > I would hope that we would collaborate on merging the best parts of all
> > > into the official Cassandra sidecar, taking the always on, shared
> > nothing,
> > > highly available system that we've contributed a patchset for and
> adding
> > in
> > > many of the repair features (e.g. schedules, a nice web UI) that Reaper
> > > has.
> > >
> > >
> > > > I share Stefan's concern that consensus had not been met around a
> > > > side-car, and that it was somehow default accepted before a patch
> > landed.
> > >
> > >
> > > I feel this is not correct or fair. The sidecar and repair discussions
> > have
> > > been anything _but_ "default accepted". The timeline of consensus
> > building
> > > involving the management sidecar and repair scheduling plans:
> > >
> > > Dec 2016: Vinay worked with Jon and Alex to try to collaborate on
> Reaper
> > to
> > > come up with design goals for a repair scheduler that could work at
> > Netflix
> > > scale.
> > >
> > > ~Feb 2017: Netflix believes that the fundamental design gaps prevented
> us
> > > from using Reaper as it relies heavily on remote JMX connections and
> > > central coordination.
> > >
> > > Sep. 2017: Vinay gives a lightning talk at NGCC about a highly
> available
> > > and distributed repair scheduling sidecar/tool. He is encouraged by
> > > multiple committers to build repair scheduling into the daemon itself
> and
> > > not as a sidecar so the database is truly eventually consistent.
> > >
> > > ~Jun. 2017 - Feb. 2018: Based on internal need and the positive
> feedback
> > at
> > > NGCC, Vinay and myself prototype the distributed repair scheduler
> within
> > > Priam and roll it out at Netflix scale.
> > >
> > > Mar. 2018: I open a Jira (CASSANDRA-14346) along with a detailed 20
> page
> > > design document for adding repair scheduling to the daemon itself and
> > open
> > > the design up for feedback from the community. We get feedback from
> Alex,
> > > Blake, Nate, Stefan, and Mick. As far as I know there were zero
> proposals
> > > to contribute Reaper at this point. We hear the consensus that the
> > > community would prefer repair scheduling in a separate distributed
> > sidecar
> > > rather than in the daemon itself and we re-work the design to match
> this
> > > consensus, re-aligning with our original proposal at NGCC.
> > >
> > > Apr 2018: Blake brings the discussion of repair scheduling to the dev
> > list
> > > (
> > >
> > >
> >
> https://lists.apache.org/thread.html/760fbef677f27aa5c2ab4c375c7efeb81304fea428deff986ba1c2eb@%3Cdev.cassandra.apache.org%3E
> > > ).
> > > Many

Re: Proposing an Apache Cassandra Management process

2018-09-07 Thread Blake Eggleston
I think we should accept the reaper project as is and make that cassandra 
management process 1.0, then integrate the netflix scheduler (and other new 
features) into that.

The ultimate goal would be for the netflix scheduler to become the default 
repair scheduler, but I think using reaper as the starting point makes it 
easier to get there. 

Reaper would bring a prod user base that would realistically take 2-3 years to 
build up with a new project. As an operator, switching to a cassandra 
management process that’s basically a re-brand of an existing and commonly used 
management process isn’t super risky. Asking operators to switch to a new 
process is a much harder sell. 

On September 7, 2018 at 4:17:10 PM, Jeff Jirsa (jji...@gmail.com) wrote:

How can we continue moving this forward?  

Mick/Jon/TLP folks, is there a path here where we commit the  
Netflix-provided management process, and you augment Reaper to work with it?  
Is there a way we can make a larger umbrella that's modular that can  
support either/both?  
Does anyone believe there's a clear, objective argument that one is  
strictly better than the other? I haven't seen one.  



On Mon, Aug 20, 2018 at 4:14 PM Roopa Tangirala  
 wrote:  

> +1 to everything that Joey articulated with emphasis on the fact that  
> contributions should be evaluated based on the merit of code and their  
> value add to the whole offering. I hope it does not matter whether that  
> contribution comes from PMC member or a person who is not a committer. I  
> would like the process to be such that it encourages the new members to be  
> a part of the community and not shy away from contributing to the code  
> assuming their contributions are valued differently than committers or PMC  
> members. It would be sad to see the contributions decrease if we go down  
> that path.  
>  
> *Regards,*  
>  
> *Roopa Tangirala*  
>  
> Engineering Manager CDE  
>  
> *(408) 438-3156 - mobile*  
>  
>  
>  
>  
>  
>  
> On Mon, Aug 20, 2018 at 2:58 PM Joseph Lynch   
> wrote:  
>  
> > > We are looking to contribute Reaper to the Cassandra project.  
> > >  
> > Just to clarify are you proposing contributing Reaper as a project via  
> > donation or you are planning on contributing the features of Reaper as  
> > patches to Cassandra? If the former how far along are you on the donation  
> > process? If the latter, when do you think you would have patches ready  
> for  
> > consideration / review?  
> >  
> >  
> > > Looking at the patch it's very similar in its base design already, but  
> > > Reaper does have a lot more to offer. We have all been working hard to
> > move  
> > > it to also being a side-car so it can be contributed. This raises a  
> > number  
> > > of relevant questions to this thread: would we then accept both works  
> in  
> > > the Cassandra project, and what burden would it put on the current PMC  
> to  
> > > maintain both works.  
> > >  
> > I would hope that we would collaborate on merging the best parts of all  
> > into the official Cassandra sidecar, taking the always on, shared  
> nothing,  
> > highly available system that we've contributed a patchset for and adding  
> in  
> > many of the repair features (e.g. schedules, a nice web UI) that Reaper  
> > has.  
> >  
> >  
> > > I share Stefan's concern that consensus had not been met around a  
> > > side-car, and that it was somehow default accepted before a patch  
> landed.  
> >  
> >  
> > I feel this is not correct or fair. The sidecar and repair discussions  
> have  
> > been anything _but_ "default accepted". The timeline of consensus  
> building  
> > involving the management sidecar and repair scheduling plans:  
> >  
> > Dec 2016: Vinay worked with Jon and Alex to try to collaborate on Reaper  
> to  
> > come up with design goals for a repair scheduler that could work at  
> Netflix  
> > scale.  
> >  
> > ~Feb 2017: Netflix believes that the fundamental design gaps prevented us  
> > from using Reaper as it relies heavily on remote JMX connections and  
> > central coordination.  
> >  
> > Sep. 2017: Vinay gives a lightning talk at NGCC about a highly available  
> > and distributed repair scheduling sidecar/tool. He is encouraged by  
> > multiple committers to build repair scheduling into the daemon itself and  
> > not as a sidecar so the database is truly eventually consistent.  
> >  
> > ~Jun. 2017 - Feb. 2018: Based on internal need and the positive feedback  
> at  
> > NGCC, Vinay and myself prototype the distributed repair scheduler within  
> > Priam and roll it out at Netflix scale.  
> >  
> > Mar. 2018: I open a Jira (CASSANDRA-14346) along with a detailed 20 page  
> > design document for adding repair scheduling to the daemon itself and  
> open  
> > the design up for feedback from the community. We get feedback from Alex,  
> > Blake, Nate, Stefan, and Mick. As far as I know there were zero proposals  
> > to contribute Reaper at this point. 

Re: Proposing an Apache Cassandra Management process

2018-09-07 Thread Joseph Lynch
On Fri, Sep 7, 2018 at 5:03 PM Jonathan Haddad  wrote:
>
> We haven’t even defined any requirements for an admin tool. It’s hard to
> make a case for anything without agreement on what we’re trying to build.
>
We were/are trying to sketch out scope/requirements in the #14395 and
#14346 tickets as well as their associated design documents. I think
the general proposed direction is a distributed 1:1 management sidecar
process similar in architecture to Netflix's Priam except explicitly
built to be general and pluggable by anyone rather than tightly
coupled to AWS.

Dinesh, Vinay and I were aiming for a small scope at first, taking
an iterative approach with just enough upfront design but not so much
that we are unable to make any progress at all. For example, maybe
something like:

1. Get a super simple and non controversial sidecar process that ships
with Cassandra and exposes a lightweight HTTP interface to e.g. some
basic JMX endpoints
2a. Add a pluggable execution engine for cron/oneshot/scheduled jobs
with the basic interfaces and state store and such
2b. Start scoping and implementing the full HTTP interface, e.g.
backup status, cluster health status, etc ...
3a. Start integrating implementations of the jobs from 2a such as
snapshot, backup, cluster restart, daemon + sstable upgrade, repair,
etc
3b. Start integrating UI components that pair with the HTTP interface from 2b
4. ?? Perhaps start unlocking next generation operations like moving
"background" activities like compaction, streaming, repair etc into
one or more sidecar contained processes to ensure the main daemon only
handles read+write requests
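
To make step 2a a little more concrete, here is a minimal sketch of what a
pluggable execution engine for one-shot and scheduled jobs could look like.
It is only an illustration: the names Job, JobEngine, register and tick are
hypothetical and are not taken from any of the patches or design documents
discussed in this thread, and the "state store" is simply an in-memory dict.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Job:
    """A unit of work the sidecar can run (hypothetical interface)."""
    name: str
    run: Callable[[], str]               # returns a short status message
    interval_s: Optional[float] = None   # None means one-shot
    next_run: float = 0.0


class JobEngine:
    """Tiny in-memory execution engine; the state store is just a dict."""

    def __init__(self, clock: Callable[[], float] = time.monotonic):
        self._clock = clock
        self._jobs: Dict[str, Job] = {}
        self.state: Dict[str, List[str]] = {}   # job name -> result history

    def register(self, job: Job) -> None:
        """Plug in a job; it becomes due immediately."""
        job.next_run = self._clock()
        self._jobs[job.name] = job
        self.state.setdefault(job.name, [])

    def tick(self) -> List[str]:
        """Run every job that is due and return the names that ran."""
        now = self._clock()
        ran = []
        for job in list(self._jobs.values()):
            if job.next_run > now:
                continue
            try:
                result = job.run()
            except Exception as exc:     # record the failure, keep the engine alive
                result = f"failed: {exc}"
            self.state[job.name].append(result)
            ran.append(job.name)
            if job.interval_s is None:
                del self._jobs[job.name]             # one-shot jobs run once
            else:
                job.next_run = now + job.interval_s  # reschedule periodic jobs
        return ran
```

A caller would register jobs such as Job("snapshot", take_snapshot) or
Job("repair", run_repair, interval_s=86400) and call tick() from a loop. The
concrete jobs from 3a would then be plugins behind the same small interface.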

There are going to be a lot of questions to answer, and I think trying
to answer them all up front will mean that we get nowhere or make
unfortunate compromises that cripple the project from the start. If
people think we need to do more design and discussion than we have
been doing then we can spend more time on the design, but personally
I'd rather start iterating on code and prove value incrementally. If
it doesn't work out we won't release it GA to the community ...

-Joey




Re: Proposing an Apache Cassandra Management process

2018-09-07 Thread Jeff Jirsa
I’d also like to see the end state you describe: reaper UI wrapping the Netflix 
management process with pluggable scheduling (either as is with reaper now, or 
using the Netflix scheduler), but I don’t think that means we need to start 
with reaper - I'd personally prefer the opposite direction, starting with 
something small and isolated and layering on top. 

-- 
Jeff Jirsa


> On Sep 7, 2018, at 5:42 PM, Blake Eggleston  wrote:
> 
> I think we should accept the reaper project as is and make that cassandra 
> management process 1.0, then integrate the netflix scheduler (and other new 
> features) into that.
> 
> The ultimate goal would be for the netflix scheduler to become the default 
> repair scheduler, but I think using reaper as the starting point makes it 
> easier to get there. 
> 
> Reaper would bring a prod user base that would realistically take 2-3 years 
> to build up with a new project. As an operator, switching to a cassandra 
> management process that’s basically a re-brand of an existing and commonly 
> used management process isn’t super risky. Asking operators to switch to a 
> new process is a much harder sell. 
> 

Re: Proposing an Apache Cassandra Management process

2018-09-07 Thread Mick Semb Wever


> How can we continue moving this forward?
> 
> Mick/Jon/TLP folks, is there a path here where we commit the
> Netflix-provided management process, and you augment Reaper to work with it?
> Is there a way we can make a larger umbrella that's modular that can
> support either/both?


There seems to be a reluctance to collaborate between Reaper and Netflix atm, 
despite efforts from both sides. It worries me that there appears to be this 
whole roadmap of design and implementation ready to go, and reference work for 
it that we don't get to see, while we (Reaper) do have a real reference work out 
in the open. This will of course make it challenging for the Reaper folk to get 
involved: it leaves us with the feeling that we're just having to re-invent the 
wheel for someone else, when it would be much quicker for us to evolve Reaper 
to the designs proposed.

Such collaboration between us, I reckon, needs to happen first ("show, don't 
tell"). So I'd throw out the idea that we start the side-car collaboration 
project as an external github project. From there if we can put something 
together that satisfies people's designs for a new side-car tool, and maintains 
the Reaper user base and feature-list, then we've proven that the 
community-spirit and trust is in place to bring it into the Cassandra project. 
Also, unless we have existing PMC members who we know will guardian this project, we're 
just presuming everyone will know all the ins and outs of ASF processes and 
requirements. Maybe it's better to be practicing this stuff first, rather than 
presuming it will just happen. 

That's just my two cents, and an attempt at a compromise. My first vote would 
still be to bring in Reaper and retro-fit it step by step (and I'd be more than 
happy to accept and work on someone else's roadmap if we took this approach).

Mick




Re: Proposing an Apache Cassandra Management process

2018-09-07 Thread Blake Eggleston
What’s the benefit of doing it that way vs starting with reaper and integrating 
the netflix scheduler? If reaper was just a really inappropriate choice for the 
cassandra management process, I could see that being a better approach, but I 
don’t think that’s the case.

If our management process isn’t a drop in replacement for reaper, then reaper 
will continue to exist, which will split the user and developer base between 
the 2 projects. That won't be good for either project.

On September 7, 2018 at 6:12:01 PM, Jeff Jirsa (jji...@gmail.com) wrote:

I’d also like to see the end state you describe: reaper UI wrapping the Netflix 
management process with pluggable scheduling (either as is with reaper now, or 
using the Netflix scheduler), but I don’t think that means we need to start 
with reaper - if personally prefer the opposite direction, starting with 
something small and isolated and layering on top.  

--  
Jeff Jirsa  



Re: Proposing an Apache Cassandra Management process

2018-09-07 Thread Vinay Chella
> I think we should accept the reaper project as is and make that
> cassandra management process 1.0, then integrate the Netflix scheduler (and
> other new features) into that.
Integrating the Netflix scheduler into Reaper would mostly mean refactoring
Reaper's code, since they are different architectures.

> Reaper would bring a prod user base that would realistically take 2-3
> years to build up with a new project.
IMO, it is great if we have that, but this should not be the deciding factor.

> As an operator, switching to a cassandra management process that’s
> basically a re-brand of an existing and commonly used management process
> isn’t super risky. Asking operators to switch to a new process is a much
> harder sell.
Reaper is far from being a "cassandra management process". I
understand it does its job in doing repairs and snapshots (and other things
if I missed any), but the responsibilities of a Cassandra management sidecar
process go far beyond those just mentioned. All the design
goals mentioned in this thread by Joey (pluggable execution engine,
backup, restore, ring health detection, sstable upgrade, rolling restarts,
topology-aware maintenance, replacements of entire fleet without
compromising availability, etc.) are critical operations of a "cassandra
management process" that are hard to add on to a system which is not
architected for them. Going down that path would basically mean a total
rework/refactor of Reaper. And don't get me wrong, Reaper is a great
repair tool available to the C* community, with a great visualization which
makes it easy to use.

We prefer what Jeff proposes, starting with something small and isolated
and layering the best of all sidecars incrementally on top.

--Vinay Chella


On Fri, Sep 7, 2018 at 6:11 PM Jeff Jirsa  wrote:

> I’d also like to see the end state you describe: reaper UI wrapping the
> Netflix management process with pluggable scheduling (either as is with
> reaper now, or using the Netflix scheduler), but I don’t think that means
> we need to start with reaper - if personally prefer the opposite direction,
> starting with something small and isolated and layering on top.
>
> --
> Jeff Jirsa
>
>

Re: QA signup

2018-09-07 Thread Mick Semb Wever


> Periodic SNAPSHOT builds sounds great. I'd feel much better about builds 
> published as date- or SHA-stamped snapshots / nightlies rather than 
> calling them alphas at this point, as everyone's testing work is 
> beginning. Can someone offer details on what would need to be done to 
> publish snapshots or nightlies in the context of Apache build 
> infrastructure?


Yes.

For the maven artefacts, timestamped snapshots (or nightlies) can be made 
available through
  https://repository.apache.org/content/repositories/snapshots/

For the downloadable binary and source artefacts the staging area can be used 
for nightlies
  https://dist.apache.org/repos/dist/dev/cassandra/


(This staging area is now what is supposed to be used for staging releases, 
instead of people.apache.org which was the old ASF way of doing it. I wrote 
some of this up in the recent release docs, which are still wip: 
https://github.com/apache/cassandra/blob/trunk/doc/source/development/release_process.rst
 )
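
For anyone who wants to check whether a snapshot has actually been published,
the sketch below builds the URL of the maven-metadata.xml file that a standard
Maven repository layout keeps for each snapshot version. The coordinates in
the usage note (org.apache.cassandra, cassandra-all, 4.0-SNAPSHOT) are
assumptions for illustration, and the helper name snapshot_metadata_url is
hypothetical.

```python
# Snapshot repository mentioned above.
SNAPSHOT_REPO = "https://repository.apache.org/content/repositories/snapshots"


def snapshot_metadata_url(group_id: str, artifact_id: str, version: str,
                          repo: str = SNAPSHOT_REPO) -> str:
    """Return the maven-metadata.xml URL for a snapshot version.

    Follows the standard Maven repository layout:
    <repo>/<groupId with dots as slashes>/<artifactId>/<version>/maven-metadata.xml
    """
    if not version.endswith("-SNAPSHOT"):
        raise ValueError(f"not a snapshot version: {version}")
    group_path = group_id.replace(".", "/")
    return f"{repo.rstrip('/')}/{group_path}/{artifact_id}/{version}/maven-metadata.xml"
```

For example, snapshot_metadata_url("org.apache.cassandra", "cassandra-all",
"4.0-SNAPSHOT") gives the metadata location under the snapshots repository.
Whether anything is published there depends on the build actually deploying
snapshots.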

Mick




Re: Proposing an Apache Cassandra Management process

2018-09-07 Thread Joseph Lynch
> What’s the benefit of doing it that way vs starting with reaper and 
> integrating the netflix scheduler? If reaper was just a really inappropriate 
> choice for the cassandra management process, I could see that being a better 
> approach, but I don’t think that’s the case.
>
The benefit, as Dinesh and I argued earlier, is starting without
technical debt (especially architectural technical debt) and taking
only the best components from the multiple community sidecars for the
Cassandra management sidecar. To be clear, I think Priam is much
closer to the proposed management sidecar than Reaper is (and Priam +
the repair scheduler has basically all proposed scope), but like I
said earlier in the other thread I don't think we should take Priam as
is due to technical debt and I don't think we should take Reaper
either. The community should learn from the many sidecars we've built
and solve the problem once inside the Cassandra sidecar.

> If our management process isn’t a drop in replacement for reaper, then reaper 
> will continue to exist, which will split the user and developers base between 
> the 2 projects. That won't be good for either project.
I think Reaper is a great repair tool for some infrastructures, but I
don't think the management sidecar is about running repairs. It's
about building a general purpose tool that may happen to run repairs
if someone chooses to use that particular plugin. To be honest I think
it's great that there are competing community repair tools ... this is
how we learn from all of them and build the simplest and most narrowly
tailored solution into the database itself...

-Joey




Re: Proposing an Apache Cassandra Management process

2018-09-07 Thread Jeff Jirsa
The benefit is that it more closely matches the design doc from 5 months ago, 
which is decidedly not about coordinating repair - it’s about a general purpose 
management tool, where repair is one of many proposed tasks.

https://docs.google.com/document/d/1UV9pE81NaIUF3g4L1wxq09nT11AkSQcMijgLFwGsY3s/edit


By starting with a tool that is built to run repair, you’re sacrificing 
generality and accepting something purpose built for one sub task. It’s an 
important subtask, and it’s a nice tool, but it’s not an implementation of the 
proposal, it’s an alternative that happens to do some of what was proposed.

-- 
Jeff Jirsa


> On Sep 7, 2018, at 6:53 PM, Blake Eggleston  wrote:
> 
> What’s the benefit of doing it that way vs starting with reaper and 
> integrating the netflix scheduler? If reaper was just a really inappropriate 
> choice for the cassandra management process, I could see that being a better 
> approach, but I don’t think that’s the case.
> 
> If our management process isn’t a drop in replacement for reaper, then reaper 
> will continue to exist, which will split the user and developers base between 
> the 2 projects. That won't be good for either project.
> 

Re: Proposing an Apache Cassandra Management process

2018-09-07 Thread Blake Eggleston
Right, I understand the arguments for starting a new project. I’m not saying 
reaper is, technically speaking, the best place to start. The point I’m trying 
to make is that the non-technical advantages of using an existing project as a 
starting point may outweigh the technical benefits of a clean slate. Whether 
that’s the case or not, it’s not a strictly technical decision, and the 
non-technical advantages of starting with reaper need to be weighed.

On September 7, 2018 at 8:19:50 PM, Jeff Jirsa (jji...@gmail.com) wrote:

The benefit is that it more closely matched the design doc, from 5 months ago, 
which is decidedly not about coordinating repair - it’s about a general purpose 
management tool, where repair is one of many proposed tasks  

https://docs.google.com/document/d/1UV9pE81NaIUF3g4L1wxq09nT11AkSQcMijgLFwGsY3s/edit
  


By starting with a tool that is built to run repair, you’re sacrificing 
generality and accepting something purpose built for one sub task. It’s an 
important subtask, and it’s a nice tool, but it’s not an implementation of the 
proposal, it’s an alternative that happens to do some of what was proposed.  

--  
Jeff Jirsa  


> On Sep 7, 2018, at 6:53 PM, Blake Eggleston  wrote:  
>  
> What’s the benefit of doing it that way vs starting with reaper and 
> integrating the netflix scheduler? If reaper was just a really inappropriate 
> choice for the cassandra management process, I could see that being a better 
> approach, but I don’t think that’s the case.  
>  
> If our management process isn’t a drop in replacement for reaper, then reaper 
> will continue to exist, which will split the user and developers base between 
> the 2 projects. That won't be good for either project.  
>  
> On September 7, 2018 at 6:12:01 PM, Jeff Jirsa (jji...@gmail.com) wrote:  
>  
> I’d also like to see the end state you describe: reaper UI wrapping the 
> Netflix management process with pluggable scheduling (either as is with 
> reaper now, or using the Netflix scheduler), but I don’t think that means we 
> need to start with reaper - if personally prefer the opposite direction, 
> starting with something small and isolated and layering on top.  
>  
> --  
> Jeff Jirsa  
>  
>  
>> On Sep 7, 2018, at 5:42 PM, Blake Eggleston  wrote:  
>>  
>> I think we should accept the reaper project as is and make that cassandra 
>> management process 1.0, then integrate the netflix scheduler (and other new 
>> features) into that.  
>>  
>> The ultimate goal would be for the netflix scheduler to become the default 
>> repair scheduler, but I think using reaper as the starting point makes it 
>> easier to get there.  
>>  
>> Reaper would bring a prod user base that would realistically take 2-3 years 
>> to build up with a new project. As an operator, switching to a cassandra 
>> management process that’s basically a re-brand of an existing and commonly 
>> used management process isn’t super risky. Asking operators to switch to a 
>> new process is a much harder sell.  
>>  
>> On September 7, 2018 at 4:17:10 PM, Jeff Jirsa (jji...@gmail.com) wrote:  
>>  
>> How can we continue moving this forward?  
>>  
>> Mick/Jon/TLP folks, is there a path here where we commit the  
>> Netflix-provided management process, and you augment Reaper to work with it? 
>>  
>> Is there a way we can make a larger umbrella that's modular that can  
>> support either/both?  
>> Does anyone believe there's a clear, objective argument that one is  
>> strictly better than the other? I haven't seen one.  
>>  
>>  
>>  
>> On Mon, Aug 20, 2018 at 4:14 PM Roopa Tangirala  
>>  wrote:  
>>  
>>> +1 to everything that Joey articulated with emphasis on the fact that  
>>> contributions should be evaluated based on the merit of code and their  
>>> value add to the whole offering. I hope it does not matter whether that  
>>> contribution comes from PMC member or a person who is not a committer. I  
>>> would like the process to be such that it encourages the new members to be  
>>> a part of the community and not shy away from contributing to the code  
>>> assuming their contributions are valued differently than committers or PMC  
>>> members. It would be sad to see the contributions decrease if we go down  
>>> that path.  
>>>  
>>> *Regards,*  
>>>  
>>> *Roopa Tangirala*  
>>>  
>>> Engineering Manager CDE  
>>>  
>>> *(408) 438-3156 - mobile*  
>>>  
>>>  
>>>  
>>>  
>>>  
>>>  
>>> On Mon, Aug 20, 2018 at 2:58 PM Joseph Lynch   
>>> wrote:  
>>>  
> We are looking to contribute Reaper to the Cassandra project.  
>  
 Just to clarify are you proposing contributing Reaper as a project via  
 donation or you are planning on contributing the features of Reaper as  
 patches to Cassandra? If the former how far along are you on the donation  
 process? If the latter, when do you think you would have patches ready for  
 consideration / review?  
  
  
> Looking at the patch it's very s