[jira] [Commented] (CASSANDRA-18824) Backport CASSANDRA-16418: Cleanup behaviour during node decommission caused missing replica

Szymon Miezal (Jira) Wed, 20 Dec 2023 08:23:07 -0800


    [ 
https://issues.apache.org/jira/browse/CASSANDRA-18824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17799089#comment-17799089
 ]


Szymon Miezal commented on CASSANDRA-18824:
-------------------------------------------

I have prepared the following patches:
 * 3.0 - 
[https://github.com/szymon-miezal/cassandra/commit/68625daa5f55dbb0873ac4603fa05f47853cbeff]
 this one also has a PR - [https://github.com/apache/cassandra/pull/2921] 
 * 3.11 - 
[https://github.com/szymon-miezal/cassandra/commit/f21777525862b8da3345e21faac40a16631b8194]
 * 4.0 - 
[https://github.com/szymon-miezal/cassandra/commit/9652bf53d09d66609356f2110ee9110f6c8d9eb2]
 * 4.1 - 
[https://github.com/szymon-miezal/cassandra/commit/51624a811449b988d16efd396187e4825b0cc5ce]
 * 5.0 - 
[https://github.com/szymon-miezal/cassandra/commit/c7dd3bfad97b18d834f89a04a3076f9e8f9a353c]
 * trunk - 
[https://github.com/szymon-miezal/cassandra/commit/096805afc658e80a8265bae3911ad3834331d325]
 (this patch intentionally contains only the test refactoring)

The patches differ between major versions, so I suspect merging will require a 
bit of effort.

> Backport CASSANDRA-16418: Cleanup behaviour during node decommission caused 
> missing replica
> -------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-18824
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18824
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Bootstrap and Decommission
>            Reporter: Szymon Miezal
>            Assignee: Szymon Miezal
>            Priority: Normal
>             Fix For: 3.0.x, 3.11.x, 4.0.x, 4.1.x, 5.0.x, 5.x
>
>
> Node decommission triggers data transfer to other nodes. While this transfer 
> is in progress,
> receiving nodes temporarily hold token ranges in a pending state. However, 
> the cleanup process currently doesn't consider these pending ranges when 
> calculating token ownership.
> As a consequence, data that is already stored in sstables gets inadvertently 
> cleaned up.
> STR:
>  * Create two node cluster
>  * Create keyspace with RF=1
>  * Insert sample data (assert data is available when querying both nodes)
>  * Start decommission process of node 1
>  * Start running cleanup in a loop on node 2 until decommission on node 1 
> finishes
>  * Verify that all rows are in the cluster - it will fail as the previous step 
> removed some of the rows
> It seems that the cleanup process does not take the pending ranges into 
> account; it uses only the local ranges - 
> [https://github.com/apache/cassandra/blob/caad2f24f95b494d05c6b5d86a8d25fbee58d7c2/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L466].
> There are two solutions to the problem.
> One would be to change the cleanup process so that it starts taking 
> pending ranges into account. Even though it might sound tempting at first, it 
> will require involved changes and a lot of testing effort.
> Alternatively, we could interrupt/prevent the cleanup process from running 
> when any pending range on a node is detected. That sounds like a reasonable 
> solution to the problem and something that is relatively easy to implement.
> The bug has already been fixed in 4.x with CASSANDRA-16418; the goal of this 
> ticket is to backport it to 3.x.
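
For readers following the description above, here is a minimal, self-contained sketch of the failure mode and of the second option (refusing cleanup while pending ranges exist). It is not code from Cassandra or from the patches: the class CleanupGuardSketch, the TokenRange type and both cleanup methods are hypothetical names invented for illustration, and token ranges are simplified to non-wrapping (left, right] intervals over long tokens.

```java
import java.util.ArrayList;
import java.util.List;

public class CleanupGuardSketch {
    /** Hypothetical simplified token range (left, right], without wrap-around. */
    static final class TokenRange {
        final long left;
        final long right;

        TokenRange(long left, long right) {
            this.left = left;
            this.right = right;
        }

        boolean contains(long token) {
            return token > left && token <= right;
        }
    }

    static boolean owns(List<TokenRange> ranges, long token) {
        for (TokenRange r : ranges) {
            if (r.contains(token)) {
                return true;
            }
        }
        return false;
    }

    /** Cleanup that looks only at local ranges, as the ticket describes. */
    static List<Long> cleanupLocalOnly(List<Long> tokens, List<TokenRange> local) {
        List<Long> kept = new ArrayList<>();
        for (long t : tokens) {
            if (owns(local, t)) {
                kept.add(t);
            }
        }
        return kept;
    }

    /** Assumed guard: refuse to run cleanup while any pending range exists. */
    static List<Long> cleanupGuarded(List<Long> tokens,
                                     List<TokenRange> local,
                                     List<TokenRange> pending) {
        if (!pending.isEmpty()) {
            throw new IllegalStateException("Node has pending ranges; cleanup refused");
        }
        return cleanupLocalOnly(tokens, local);
    }

    public static void main(String[] args) {
        List<TokenRange> local = List.of(new TokenRange(0, 50));
        // Range being streamed from the decommissioning node, not yet owned.
        List<TokenRange> pending = List.of(new TokenRange(50, 100));
        // Token 60 has already been written to an sstable on the receiving node.
        List<Long> tokens = List.of(10L, 60L);

        // Unguarded cleanup drops token 60 because it is not in a local range.
        System.out.println("local-only cleanup keeps: " + cleanupLocalOnly(tokens, local));

        try {
            cleanupGuarded(tokens, local, pending);
        } catch (IllegalStateException e) {
            System.out.println("guarded cleanup: " + e.getMessage());
        }
    }
}
```

The guard trades availability of cleanup during range movements for safety: nothing is deleted until the pending ranges have been resolved, at which point the local-range check is sufficient again.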



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

