[ 
https://issues.apache.org/jira/browse/CASSANDRA-21214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18065423#comment-18065423
 ] 

Branimir Lambov commented on CASSANDRA-21214:
---------------------------------------------

Added a test demonstrating the problem in [this 
branch|https://github.com/blambov/cassandra/tree/CASSANDRA-21214-test].

> Incremental repairs cannot make progress on busy node
> -----------------------------------------------------
>
>                 Key: CASSANDRA-21214
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-21214
>             Project: Apache Cassandra
>          Issue Type: Bug
>          Components: Consistency/Repair, Local/Compaction
>            Reporter: Branimir Lambov
>            Priority: Normal
>
> On a loaded, high-density node, incremental repair can end up in a state where 
> it cannot make any progress because it cannot get exclusive access to the 
> sstables it needs.
> This happens because there can be compaction tasks that have been created 
> (and thus have opened a transaction over their files and marked them as 
> compacting) but have not yet started, or have not yet reached the point where 
> they register with the active operations tracker. This phase can last a long 
> time, especially if the tasks are waiting for a thread to run.
> As a result, when `runWithCompactionsDisabled` tries to cancel ongoing 
> operations, it cannot see these scheduled but not yet active tasks. If the 
> stop request applied to all operations, it would eventually free up threads 
> for all tasks, but incremental repair only wants to stop tasks intersecting 
> its range in the unrepaired arena. The compaction threads can therefore 
> remain busy doing unrelated work for hours after the request is made, and 
> the scheduled tasks get no chance to execute during the cancellation period.
> This manifests as incremental repair tasks consistently failing because they 
> cannot perform the initial anticompaction step.
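
The mechanism described above can be illustrated with a small standalone simulation. This is only a sketch under stated assumptions, not Cassandra code: `PendingCompactionDemo`, `createTask`, `cancelActive` and the two sets are hypothetical stand-ins for the compaction executor, the "compacting" marking and the active operations tracker.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical model of the problem; none of these names are Cassandra APIs.
public class PendingCompactionDemo {
    // sstables currently marked as compacting (held by some task's transaction)
    final Set<String> compacting = ConcurrentHashMap.newKeySet();
    // operations that a cancellation request is able to see
    final Set<String> activeOperations = ConcurrentHashMap.newKeySet();

    // Creating the task marks the sstable immediately; registration with the
    // tracker happens only once a thread actually starts running the task.
    Runnable createTask(String sstable) {
        compacting.add(sstable);
        return () -> {
            activeOperations.add(sstable);
            try {
                // compaction work would happen here
            } finally {
                activeOperations.remove(sstable);
                compacting.remove(sstable);
            }
        };
    }

    // A targeted cancellation can only reach operations that have registered.
    boolean cancelActive(String sstable) {
        return activeOperations.remove(sstable);
    }

    // Returns { cancelled, acquired } for a task stuck behind unrelated work.
    static boolean[] simulate() {
        PendingCompactionDemo node = new PendingCompactionDemo();
        ExecutorService executor = Executors.newSingleThreadExecutor();
        CountDownLatch unrelatedRunning = new CountDownLatch(1);
        CountDownLatch releaseUnrelated = new CountDownLatch(1);
        try {
            // Unrelated long-running work occupies the only compaction thread.
            executor.submit(() -> {
                unrelatedRunning.countDown();
                releaseUnrelated.await();
                return null;
            });
            unrelatedRunning.await();

            // The task is created and queued, but cannot start.
            executor.submit(node.createTask("sstable-1"));

            // Repair tries to cancel operations on the sstable, then acquire it.
            boolean cancelled = node.cancelActive("sstable-1");
            boolean acquired = !node.compacting.contains("sstable-1");
            return new boolean[] { cancelled, acquired };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            releaseUnrelated.countDown();
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        boolean[] result = simulate();
        System.out.println("cancelled=" + result[0] + " acquired=" + result[1]);
    }
}
```

In this model both values come out false: there is nothing visible to cancel, yet the sstable stays marked as compacting until the unrelated work finishes and the queued task finally gets a thread.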



