[DISCUSS] CASSANDRA-12126: LWTs correctness and performance

2020-11-11 Thread Benjamin Lerer
CASSANDRA-12126 addresses one correctness issue in Lightweight
Transactions (LWTs). Unfortunately, the current patch developed by Sylvain and
Benedict requires an extra round trip between the coordinator and the
replicas for SERIAL and LOCAL_SERIAL reads.
After some experimentation, Benedict discovered that this extra round trip
could lead to a significant increase in timeouts for read-heavy workloads.

Users for whom this behavior is a problem will be able to switch back to
the old behavior using a system property, thereby choosing performance
over correctness.
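
As a minimal sketch of how such a system-property switch might look (this is
not the code from the patch; the class name, method name and property name
below are hypothetical and chosen only for illustration), the caller supplies
the default so that the same code can model either a correct-by-default or a
fast-by-default release:

```java
public final class SerialReadToggle {
    // Hypothetical property name; the thread does not name the real one.
    public static final String PROPERTY = "example.lwt.serial_read_extra_round_trip";

    private SerialReadToggle() {}

    /**
     * Returns true if SERIAL/LOCAL_SERIAL reads should perform the extra
     * round trip (the correctness fix). Falls back to defaultValue when the
     * property is not set.
     */
    public static boolean extraRoundTripEnabled(boolean defaultValue) {
        String raw = System.getProperty(PROPERTY);
        return raw == null ? defaultValue : Boolean.parseBoolean(raw.trim());
    }

    public static void main(String[] args) {
        // Correct by default: operators opt out with -D<property>=false.
        System.out.println("correct by default: " + extraRoundTripEnabled(true));
        // Fast by default: operators opt in with -D<property>=true.
        System.out.println("fast by default:    " + extraRoundTripEnabled(false));
    }
}
```

With this shape, an operator would pass something like
-Dexample.lwt.serial_read_extra_round_trip=false on the JVM command line to
trade the fix for the old latency profile.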

On the side, Benedict has worked on another approach that does not suffer
from that performance problem and also addresses some LWT correctness
issues that can happen when adding or removing nodes. He initially intended
to deliver that improvement in 4.X but can try to incorporate it into 4.0.

Regarding CASSANDRA-12126 and 4.0, we have several options, and
Benedict, Sylvain and I would like the community's feedback on them.

We can:

   1. Try to use Benedict's proposal for 4.0 if the community has the
   appetite for it. The main issue there is a potential extra delay for 4.0.
   2. Do nothing for 4.0, meaning we do not commit the current patch. We have
   lived with that issue for a long time and can probably wait a bit longer for a
   proper solution.
   3. Commit the patch as is, fixing the correctness issue but potentially
   introducing a performance issue until we release a better solution.
   4. Change the patch to default to the current behavior while allowing
   people to enable the new one if correctness is a problem for them.

  Thanks in advance for your feedback.


Re: [DISCUSS] CASSANDRA-12126: LWTs correctness and performance

2020-11-11 Thread Joshua McKenzie
How old is the defect surfaced by C-12126? I.e., is this something we've had
since the initial introduction of Paxos, or is it a regression we introduced
somewhere along the way?


Re: [DISCUSS] CASSANDRA-12126: LWTs correctness and performance

2020-11-11 Thread Benedict Elliott Smith
It's been there since the beginning.

If we were to consider the alternative proposal for 4.0, it would not have to 
be blocking for release. I had planned to come forward after 4.0, primarily 
because I did not want to create further political complexities for the project 
at this time, but also because I do not presently have the time to produce all 
of the documentation we might like for such a proposal. However, the work is 
ready, has already been reviewed by multiple committers, has had more extensive 
testing than any feature I'm aware of to date, and could be made available for 
4.0 in fairly short order. While the work itself is non-trivial, the work to 
integrate it is not complex.  It would also be optional, and configurable at 
runtime.

The only likely blocker would be the process of review, and any other due
diligence the project might want to undertake. I am absolutely not
advocating for or against an accelerated timescale. I have no personal
preference for the approach taken; I am just providing this for context.



-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: [DISCUSS] CASSANDRA-12126: LWTs correctness and performance

2020-11-11 Thread Joshua McKenzie
Got it.

Thanks for the extra context.

No real opinion here. :)


Re: [DISCUSS] CASSANDRA-12126: LWTs correctness and performance

2020-11-11 Thread Michael Semb Wever


> Regarding CASSANDRA-12126 and 4.0, we have several options, and
> Benedict, Sylvain and I would like the community's feedback on them.
>
> We can:
>
>    1. Try to use Benedict's proposal for 4.0 if the community has the
>    appetite for it. The main issue there is a potential extra delay for 4.0.
>    2. Do nothing for 4.0, meaning we do not commit the current patch. We have
>    lived with that issue for a long time and can probably wait a bit longer for a
>    proper solution.
>    3. Commit the patch as is, fixing the correctness issue but potentially
>    introducing a performance issue until we release a better solution.
>    4. Change the patch to default to the current behavior while allowing
>    people to enable the new one if correctness is a problem for them.
> 


If these options are for 4.0, is it then (4) that is being applied to 3.0
and 3.11?

If that is the case then I would vote for also applying (4) to 4.0, given we are
now in front of beta4. Please let's not further delay 4.0.

Post 4.0, if (1) is as described "a parallel implementation of the same
underlying Paxos algorithm", can it also be pluggable (either opt-in or opt-out)?
And would/could EPaxos become pluggable too in a similar manner (if it
eventuates)? I'm in favour of providing more pluggable interfaces into C*,
along with the code quality improvements that will have to accompany
them.






Re: [DISCUSS] CASSANDRA-12126: LWTs correctness and performance

2020-11-11 Thread Benedict Elliott Smith
In my opinion, a similar calculus should be applied to 3.0 and 3.11. This is
a(n arguably quite serious) bug, so whatever is not overly onerous to backport
should be considered while they are supported. The work under discussion has
two components: a replacement for the core consensus algorithm, and mechanisms
to ensure safety across range movements. The latter might be more invasive for
3.x, but the former should be quite easy to backport and as such is probably quite
well justified.

> can it also be pluggable (either opt-in or opt-out)?

I think pluggable means something different to opt-in/opt-out, at least to me.  
I'm all for more pluggability, and also for more optionality, but the decision 
is very sensitive to context. We need to be able to select between our options, 
which for consensus practically means supporting live migration - which is 
exceptionally challenging in any general sense (and perhaps inherently 
non-pluggable).

As to future development for consensus, I personally hope the work we are
discussing here will be a strong platform for it, but obviously that's for the
community to decide later on. I think the work to take it forwards to something
epaxos-like will not be that herculean, with some incremental milestones en
route. But that's a totally different discussion for the future, and one for
either a CEP or a small intercollegiate working group.




Re: [DISCUSS] CASSANDRA-12126: LWTs correctness and performance

2020-11-11 Thread Sumanth Pasupuleti
Knowing there is a correctness issue in LWT, and given users use LWT
primarily for correctness, my opinion is we should commit the correctness
patch (which makes it one of #1, #3 or #4).

I agree we should not cause further delay to the 4.0 release (making it one of
#3 or #4).

The con for #3 would be that applications may have to rework their (and
downstreams') configuration(s) to accommodate the performance
regression, which may not be ideal for the seamless 4.0 upgrade
that we expect users to experience.

Now, given this correctness issue has been present since the beginning, existing
LWT users would likely notice no difference w.r.t. correctness,
since they may have already worked around this bug (if they noticed it), so +1
to option #4.
