Well, I expressed a preference for #3 over #4, particularly for the 3.x series. However, at this point, I think the lack of a clear project decision means we can punt it back to you and Sylvain to make the final call.

On 20/11/2020, 16:23, "Benjamin Lerer" <benjamin.le...@datastax.com> wrote:

I will try to summarize the discussion to clarify the outcome.

Mick is in favor of #4
Summanth is in favor of #4
Sylvain's answer was not clear to me. I understood it as: I prefer #3 to #4 and I am also fine with #1
Jeff is in favor of #3 and will understand #4
David is in favor of #3 (fix bug and add flag to roll back to old behavior) in 4.0 and #4 in 3.0 and 3.11

Do not hesitate to correct me if I misunderstood your answer.

Based on these answers it seems clear that most people prefer to go for #3 or #4.

The choice between #3 (fix correctness, opt-in to current behavior) and #4 (current behavior, opt-in to correctness) is a bit less clear, especially if we consider the 3.X branches or 4.0.

Does anybody have some idea on how to choose between those two choices, or some extra opinions on #3 versus #4?

On Wed, Nov 18, 2020 at 9:45 PM David Capwell <dcapw...@gmail.com> wrote:

> I feel that #4 (fix bug and add flag to roll back to old behavior) is best.
>
> About the alternative implementation, I am fine adding it to 3.x and 4.0,
> but should treat it as a different path disabled by default that you can
> opt into, with a plan to opt in by default "eventually".
>
> On Wed, Nov 18, 2020 at 11:10 AM Benedict Elliott Smith <bened...@apache.org>
> wrote:
>
> > Perhaps there might be broader appetite to weigh in on which major
> > releases we might target for work that fixes the correctness bug without
> > serious performance regression?
> >
> > i.e., if we were to fix the correctness bug now, introducing a serious
> > performance regression (either opt-in or opt-out), but were to land work
> > without this problem for 5.0, would there be appetite to backport this
> > work to any of 4.0, 3.11 or 3.0?
> >
> > On 18/11/2020, 18:31, "Jeff Jirsa" <jji...@gmail.com> wrote:
> >
> > This is complicated and relatively few people on earth understand it, so
> > having little feedback is mostly expected, unfortunately.
> >
> > My normal emotional response is "correctness is required, opt-in to
> > performance improvements that sacrifice strict correctness", but I'm also
> > sure this is going to surprise people, and would understand / accept #4
> > (default to current, opt-in to correct).
> >
> > On Wed, Nov 18, 2020 at 4:54 AM Benedict Elliott Smith
> > <bened...@apache.org> wrote:
> >
> > > It doesn't seem like there's much enthusiasm for any of the options
> > > available here...
> > >
> > > On 12/11/2020, 14:37, "Benedict Elliott Smith" <bened...@apache.org>
> > > wrote:
> > >
> > > > Is the new implementation a separate, distinctly modularized new
> > > > body of work
> > >
> > > It’s primarily a distinct, modularised and new body of work, however
> > > there is some shared code that has been modified - namely PaxosState, in
> > > which legacy code is maintained but modified for compatibility, and the
> > > system.paxos table (which receives a new column, and slightly modified
> > > serialization code). It is conceptually an optimised version of the
> > > existing algorithm.
> > >
> > > If there's a chance of being of value to 4.0, I can try to put up a
> > > patch next week alongside a high level description of the changes.
> > >
> > > > But a performance regression is a regression, I'm not shrugging it
> > > > off.
> > >
> > > I don't want to give the impression I'm shrugging off the correctness
> > > issue either. It's a serious issue to fix, but since all successful
> > > updates to the database are linearizable, I think it's likely that many
> > > applications behave correctly with the present semantics, or at least
> > > encounter only transient errors.
> > > No doubt many also do not, but I have no idea of the ratio.
> > >
> > > The regression isn't itself a simple issue either - depending on the
> > > topology and message latencies it is not difficult to produce inescapable
> > > contention, i.e. guaranteed timeouts - that might persist as long as
> > > clients continue to retry. It could be quite a serious degradation of
> > > service to impose on our users.
> > >
> > > I don't pretend to know the correct way to make a decision balancing
> > > these considerations, but I am perhaps more concerned about imposing
> > > service outages than I am about temporarily maintaining semantics our
> > > users have apparently accepted for years - though I absolutely share
> > > your embarrassment there.
> > >
> > > On 12/11/2020, 12:41, "Joshua McKenzie" <jmcken...@apache.org> wrote:
> > >
> > > Is the new implementation a separate, distinctly modularized new body of
> > > work or does it make substantial changes to existing implementation and
> > > subsume it?
> > >
> > > On Thu, Nov 12, 2020 at 3:56 AM Sylvain Lebresne <lebre...@gmail.com>
> > > wrote:
> > >
> > > > Regarding option #4, I'll remark that experience tends to suggest
> > > > users don't consistently read the `NEWS.txt` file on upgrade, so
> > > > option #4 will likely essentially mean "LWT has a correctness issue,
> > > > but once it broke your data enough that you'll notice, you'll be able
> > > > to dig up the proper flag to fix it for next time". I guess it's
> > > > better than nothing, of course, but I'll admit that defaulting to
> > > > "opt-in correctness", especially for a feature (LWT) that exists
> > > > uniquely to provide additional guarantees, is something I have a hard
> > > > time rallying behind.
> > > >
> > > > But a performance regression is a regression, I'm not shrugging it
> > > > off.
> > > > Still, I feel we shouldn't leave LWT with a fairly serious known
> > > > correctness bug and I frankly feel bad for "the project" that this
> > > > has been known for so long without action, so I'm a bit biased in
> > > > wanting to get it fixed asap.
> > > >
> > > > But maybe I'm overstating the urgency here, and maybe option #1 is a
> > > > better way forward.
> > > >
> > > > --
> > > > Sylvain
> > > >
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > > For additional commands, e-mail: dev-h...@cassandra.apache.org