Re: [DISCUSS] CASSANDRA-13704 Safer handling of out of range tokens

2024-09-13 Thread Benedict
It’s worth noting though that a very large engineering effort called “Transactional Cluster Metadata” is already wrapping up that properly addresses these problems, but that will be landing in 5.1 and won’t be suitable for back-porting. On 13 Sep 2024, at 21:32, Caleb Rackliffe wrote: I'd encourage

Re: [DISCUSS] CASSANDRA-13704 Safer handling of out of range tokens

2024-09-13 Thread Caleb Rackliffe
I'd encourage you to start a new DISCUSS thread around that. On Fri, Sep 13, 2024 at 2:38 PM Jaydeep Chovatia wrote: > > Rejecting/logging the traffic is a significant step forward, but that does > not solve the real problem. It still degrades the workload and requires > manual operator's involv

Re: [DISCUSS] CASSANDRA-13704 Safer handling of out of range tokens

2024-09-13 Thread Jaydeep Chovatia
Rejecting/logging the traffic is a significant step forward, but that does not solve the real problem. It still degrades the workload and requires manual operator's involvement. How about we also enhance Cassandra to automatically detect and fix the token ownership mismatch between StorageService

Re: [DISCUSS] CASSANDRA-13704 Safer handling of out of range tokens

2024-09-13 Thread Caleb Rackliffe
If it makes anyone feel better, 2600 of the 3600 lines of this patch are tests (and the rest is minor refactoring of the verb handlers). Anyway, glad to see a ton of participation here. I'll get back into implementation space today, and start dealing with review feedback as it comes in... P.S. I

Re: Welcome Chris Bannister, James Hartig, Jackson Flemming and João Reis, as cassandra-gocql-driver committers

2024-09-13 Thread Benjamin Lerer
Congratulations! On Fri, 13 Sept 2024 at 14:21, Josh McKenzie wrote: > Congratulations and welcome! It's great to have you all on board! > > On Thu, Sep 12, 2024, at 11:16 PM, guo Maxwell wrote: > > Congratulations! > > James Hartig wrote on Fri, 13 Sep 2024 at 08:11: > > > Thanks everyone! Excited to con

Re: [DISCUSS] CASSANDRA-13704 Safer handling of out of range tokens

2024-09-13 Thread Brandon Williams
On Thu, Sep 12, 2024 at 8:34 PM Josh McKenzie wrote: > I'm not advocating for us having a rigid principled stance where we reject > all nuance and don't discuss things. I'm advocating for us coalescing on a > shared default stance of correctness unless otherwise excepted. We know we're > a dive

Re: [DISCUSS] CASSANDRA-13704 Safer handling of out of range tokens

2024-09-13 Thread Mick Semb Wever
reply below. > Mick - this patch doesn't fix things 100%. It can't. BUT - it does take us > from "In all cases where this occurs you will silently lose data" to "in > some cases where this occurs you will have a rejected write, in others > you'll have coordinator level logging, and in the worst

Re: [DISCUSS] CASSANDRA-13704 Safer handling of out of range tokens

2024-09-13 Thread Josh McKenzie
I think it's worth exploring where the disconnect is just a *bit* more, though I agree with you Benedict in that there appears to be a clear consensus so the thread has evolved more into talking about our principles than what to do in this specific scenario. Mick - this patch doesn't fix things

Re: Welcome Chris Bannister, James Hartig, Jackson Flemming and João Reis, as cassandra-gocql-driver committers

2024-09-13 Thread Josh McKenzie
Congratulations and welcome! It's great to have you all on board! On Thu, Sep 12, 2024, at 11:16 PM, guo Maxwell wrote: > Congratulations! > > James Hartig wrote on Fri, 13 Sep 2024 at 08:11: >> Thanks everyone! Excited to contribute. >> >> On Thu, Sep 12, 2024, at 4:59 PM, Francisco Guerrero wrote: >

Re: [DISCUSS] CASSANDRA-13704 Safer handling of out of range tokens

2024-09-13 Thread Benedict
I think everyone has made their case, the costs and benefits are fairly well understood, and there appears to be a strong quorum that favours an approach that prioritises avoiding data loss. So, I propose we either recognise that this is the clear position of the community, or move to a vote to for

Re: [DISCUSS] CASSANDRA-13704 Safer handling of out of range tokens

2024-09-13 Thread Alex Petrov
I agree with folks saying that we absolutely need to reject misplaced writes. It may not preclude the coordinator making a local write, or making a write to a local replica, but even reducing the probability of a misplaced write being shown as a success to the client is a substantial win. On Fri, Sep 13, 2024

Re: [DISCUSS] CASSANDRA-13704 Safer handling of out of range tokens

2024-09-13 Thread Mick Semb Wever
replies below (to Scott, Josh and Jeremiah). tl;dr all four of my points remain undisputed when the patch is applied. This is a messy situation, but there is no denying the value of rejecting writes in various known popular scenarios. Point (1) remains important to highlight IMHO. On Fri, 13 Sept 2024 at

Re: [DISCUSS] CASSANDRA-13704 Safer handling of out of range tokens

2024-09-13 Thread Berenguer Blasi
+1 to rejecting on all branches. Yes, fixing bugs and problems changes how things used to work, and some users will be surprised. But it's better than being surprised by eventual data loss. On 13/9/24 3:34, Josh McKenzie wrote: Even when the fix is only partial, so really it's more about more