Re: [UPDATE] CEP-37

Jaydeep Chovatia Wed, 23 Apr 2025 12:48:13 -0700

The CEP-37 work was successfully merged into trunk today! Please
let me know if you run into any issues.


This merge is a massive win for Apache Cassandra — a significant step
forward. But we're not stopping here. There's more to come, and we are
committed to pushing repair automation even further and closing the gaps in
the remaining flows. A few examples:

   1. Automatically running repair as part of node replacement: Design
   <https://docs.google.com/document/d/1SZIQPbIWNDsbWnIk5N5tyQCQzJ4ypwuhH-t5dO5WeZs/edit?tab=t.0>
   & POC <https://github.com/jaydeepkumar1984/cassandra/pull/54> are already
   out [CASSANDRA-20281
   <https://issues.apache.org/jira/browse/CASSANDRA-20281>]
   2. Stopping repair automatically between Cassandra major version
   upgrades [CASSANDRA-20048
   <https://issues.apache.org/jira/browse/CASSANDRA-20048>]
   3. Repairing automatically when Keyspace replication changes [
   CASSANDRA-20582 <https://issues.apache.org/jira/browse/CASSANDRA-20582>]


Thanks for all the help and support from the Apache Cassandra community!

Yours sincerely,
Andy Tolbert, Chris Lohfink, Francisco Guerrero, Kristijonas Zalys, and
Jaydeep

On Sun, Mar 9, 2025 at 8:53 PM Jaydeep Chovatia <chovatia.jayd...@gmail.com>
wrote:

> Thanks a lot, Jon!
> This has truly been a team effort, with Andy Tolbert, Chris Lohfink,
> Francisco Guerrero, and Kristijonas Zalys all contributing over the past
> year. The credit belongs to everyone!
>
> Jaydeep
>
> On Sun, Mar 9, 2025 at 2:35 PM Jon Haddad <j...@rustyrazorblade.com> wrote:
>
>> This is all really exciting.  Getting a built-in, orchestrated repair is
>> a massive achievement.  Thank you for your work on this; it's incredibly
>> valuable to the community!!
>>
>> Jon
>>
>> On Sun, Mar 9, 2025 at 2:25 PM Jaydeep Chovatia <
>> chovatia.jayd...@gmail.com> wrote:
>>
>>> No problem, Dave! Thank you.
>>>
>>> Jaydeep
>>>
>>> On Sun, Mar 9, 2025 at 10:46 AM Dave Herrington <he...@rhinosource.com>
>>> wrote:
>>>
>>>> Jaydeep,
>>>>
>>>> Thank you for taking time to answer my questions and for the links to
>>>> the design and overview docs, which are excellent and answer all of my
>>>> remaining questions.  Sorry I missed those links in the CEP page.
>>>>
>>>> Great work and I will continue to follow your progress on this powerful
>>>> new feature.
>>>>
>>>> Thanks!
>>>> -Dave
>>>>
>>>> On Sat, Mar 8, 2025 at 9:36 AM Jaydeep Chovatia <
>>>> chovatia.jayd...@gmail.com> wrote:
>>>>
>>>>> Hi David,
>>>>>
>>>>> Thanks for the kind words!
>>>>>
>>>>> >Is there a goal in this CEP to make automated repair work during
>>>>> rolling upgrades, when multiple versions exist in the cluster?
>>>>> We debated this a lot on ASF Slack
>>>>> (#cassandra-repair-scheduling-cep37). The summary is that, ideally, we
>>>>> want repair to function in a mixed-version cluster, but the reality is
>>>>> that there is currently no test suite inside Apache Cassandra to verify
>>>>> streaming behavior across mixed versions, so confidence is low.
>>>>> We agreed on the following: 1) Keeping safety in mind, disable repair by
>>>>> default during mixed versions 2) Add a comprehensive test suite 3) Allow
>>>>> repair during mixed versions. Currently, we are at #1.
>>>>> >Would automated repair be smart enough to automatically stop, if it
>>>>> sees incompatible versions?
>>>>> That's the plan, and we already have a PR (CASSANDRA-20048
>>>>> <https://issues.apache.org/jira/browse/CASSANDRA-20048>) out from
>>>>> Chris Lohfink. The thing we are debating is whether to stop only on a
>>>>> major version mismatch or also on a minor version mismatch, and we are
>>>>> leaning towards stopping only for a major version mismatch. Regardless,
>>>>> this should be available soon.
>>>>> We are also extending this further, per feedback from David Capwell, so
>>>>> that repair stops automatically if we detect a new DC or a change in
>>>>> keyspace RF. That will be covered later as part of
>>>>> CASSANDRA-20414
>>>>> <https://issues.apache.org/jira/browse/CASSANDRA-20414>.
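
The "stop only on a major version mismatch" option discussed above could look roughly like the following. This is a minimal sketch, not the logic in the CASSANDRA-20048 PR: the function names `major_version` and `repair_allowed` are hypothetical, and it assumes release strings begin with a numeric major component followed by a dot (for example "5.0.3").

```python
def major_version(version: str) -> int:
    """Return the leading major component of a release string such as '5.0.3'."""
    # Assumption: the text before the first '.' is a plain integer.
    return int(version.split(".", 1)[0])


def repair_allowed(node_versions: list[str]) -> bool:
    """Allow repair only when every node reports the same major version.

    Minor or patch differences (e.g. 5.0.1 vs 5.0.3) do not block repair here,
    matching the option the thread says it is leaning towards.
    """
    majors = {major_version(v) for v in node_versions}
    return len(majors) <= 1
```

Under these assumptions, a cluster reporting "5.0.1" and "5.0.3" would keep repairing, while one reporting "4.1.5" and "5.0.3" would not.
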
>>>>>
>>>>> >If automated repair must be disabled for the entire cluster, will
>>>>> this be a single nodetool command, or must automated repair be disabled on
>>>>> each node individually?
>>>>> Yes, it is a nodetool command and does not require any restarts! All
>>>>> the *nodetool* command details are currently covered in the design doc
>>>>> <https://docs.google.com/document/d/1CJWxjEi-mBABPMZ3VWJ9w5KavWfJETAGxfUpsViPcPo/edit?tab=t.0#heading=h.89fmsespiosd>,
>>>>> and the same details will also be available in the Cassandra
>>>>> overview.adoc
>>>>> <https://github.com/apache/cassandra/pull/3598/files?short_path=e901018#diff-e90101885c1188844bb4188d1301277bfdc4a9e1e705c4ab8a6cc5a4b44460c0>.
>>>>>
>>>>> >Would it make sense for automated repair to upgrade sstables, if it
>>>>> finds old formats? (Maybe this could be a feature that could be optionally
>>>>> enabled?)
>>>>> My opinion is that it should not be part of the repair. It is best
>>>>> suited as part of the Cassandra upgrade framework; I guess Paulo M is
>>>>> looking at it.
>>>>>
>>>>> >W.R.T. the repair logging tables in the system_distributed keyspace,
>>>>> will these tables have a configurable TTL, or must they be periodically
>>>>> truncated to limit their size?
>>>>> The number of entries will equal the number of Cassandra nodes in a
>>>>> cluster. There is no TTL because each row represents the repair status of
>>>>> that particular node. The entries would be automatically added/removed as
>>>>> nodes are added/removed from the Cassandra cluster.
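
The "one row per node, no TTL" behavior described above can be pictured as a reconciliation between the status rows and the current cluster membership. This is only an illustrative sketch: the function name, the in-memory dict standing in for the table, and the "NOT_STARTED" placeholder status are all hypothetical and are not taken from the CEP-37 schema.

```python
def reconcile_status_rows(rows: dict[str, str], live_nodes: set[str]) -> dict[str, str]:
    """Return status rows keyed by node ID, with exactly one row per current node."""
    # Keep rows for nodes that are still members; rows for departed nodes are dropped.
    kept = {node: status for node, status in rows.items() if node in live_nodes}
    # Newly joined nodes get a placeholder row (hypothetical status value).
    for node in live_nodes - set(kept):
        kept[node] = "NOT_STARTED"
    return kept
```

Because the row count always equals the node count in this model, the data stays bounded without a TTL or periodic truncation.
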
>>>>>
>>>>> Jaydeep
>>>>>
>>>>> On Sat, Mar 8, 2025 at 7:46 AM Dave Herrington <he...@rhinosource.com>
>>>>> wrote:
>>>>>
>>>>>> Jaydeep,
>>>>>>
>>>>>> Thank you for your excellent efforts on this mission-critical
>>>>>> feature.  The stated goals of CEP-37 are noble and stand to make valuable
>>>>>> improvements for cluster operations.  I look forward to testing these new
>>>>>> capabilities.
>>>>>>
>>>>>> My apologies up-front if you’ve already answered these questions.  I
>>>>>> did read the CEP a number of times and the linked JIRAs, but these are my
>>>>>> questions that I couldn’t answer myself.
>>>>>>
>>>>>> I’m interested to understand the goals of CEP-37 W.R.T. rolling
>>>>>> upgrades of large clusters, as I am responsible for maintaining the
>>>>>> cluster operations runbooks for a number of customers.
>>>>>>
>>>>>> Operators have to navigate the upgrade gauntlet with automated
>>>>>> repairs disabled and get all nodes upgraded within gc_grace_seconds and
>>>>>> then do a full repair, before restarting automated repairs.
>>>>>>
>>>>>> I see that CASSANDRA-7530
>>>>>> https://issues.apache.org/jira/browse/CASSANDRA-7530 is related to
>>>>>> this.
>>>>>>
>>>>>> Is there a goal in this CEP to make automated repair work during
>>>>>> rolling upgrades, when multiple versions exist in the cluster?
>>>>>>
>>>>>> (I think this would imply that stopping automated repairs would no
>>>>>> longer be a pre-upgrade step.)
>>>>>>
>>>>>> Would automated repair be smart enough to automatically stop, if it
>>>>>> sees incompatible versions?
>>>>>>
>>>>>> Would automated repair continue between nodes with compatible
>>>>>> versions, or would it stop for the entire cluster?
>>>>>>
>>>>>> If automated repair must be disabled for the entire cluster, will
>>>>>> this be a single nodetool command, or must automated repair be disabled
>>>>>> on each node individually?
>>>>>>
>>>>>> Would it make sense for automated repair to upgrade sstables, if it
>>>>>> finds old formats? (Maybe this could be a feature that could be
>>>>>> optionally enabled?)
>>>>>>
>>>>>> W.R.T. the repair logging tables in the system_distributed keyspace,
>>>>>> will these tables have a configurable TTL, or must they be periodically
>>>>>> truncated to limit their size?
>>>>>>
>>>>>> Thanks,
>>>>>> -Dave
>>>>>>
>>>>>> David A. Herrington II
>>>>>> President and Chief Engineer
>>>>>> RhinoSource, Inc.
>>>>>>
>>>>>> *Data Lake Architecture, Cloud Computing and Advanced Analytics.*
>>>>>>
>>>>>> www.rhinosource.com
>>>>>>
>>>>>>
>>>>>> On Fri, Mar 7, 2025 at 11:48 AM Jaydeep Chovatia <
>>>>>> chovatia.jayd...@gmail.com> wrote:
>>>>>>
>>>>>>> Hello Everyone,
>>>>>>>
>>>>>>> I wanted to update you on CEP-37
>>>>>>> <https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-37+Apache+Cassandra+Unified+Repair+Solution>
>>>>>>>  (Jira:
>>>>>>> CASSANDRA-19918
>>>>>>> <https://issues.apache.org/jira/browse/CASSANDRA-19918>) work.
>>>>>>> Over the last year, some of us (Andy Tolbert, Chris Lohfink,
>>>>>>> Francisco Guerrero, and Kristijonas Zalys) have been working closely on
>>>>>>> making CEP-37 rock solid, with support from Josh McKenzie, Dinesh Joshi,
>>>>>>> and David Capwell.
>>>>>>> First and foremost, a huge thank you to everyone, including the
>>>>>>> broader Apache Cassandra community, for their invaluable contributions
>>>>>>> in making CEP-37 robust and solid!
>>>>>>>
>>>>>>> Here is the current status:
>>>>>>>
>>>>>>> *Feature stability*
>>>>>>>
>>>>>>>    - *Voted feature:* All the features mentioned in CEP-37 have
>>>>>>>    worked as expected.
>>>>>>>    - *Post-voted feature:* A few minor improvements
>>>>>>>    <https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=272927365#CEP37ApacheCassandraUnifiedRepairSolution-Post-VoteUpdates>
>>>>>>>    have been added post-vote, and they are also working as expected.
>>>>>>>    - Multiple people have tested the functionality over time.
>>>>>>>    - Some other facts: it has already been validated at scale
>>>>>>>    <https://www.youtube.com/watch?v=xFicEj6Nhq8>. Another big
>>>>>>>    Cassandra use case is in the process of validating/adopting it in
>>>>>>>    their environment.
>>>>>>>
>>>>>>> *Source Code*
>>>>>>>
>>>>>>>    - It is an opt-in feature; nobody notices anything unless
>>>>>>>    someone opts in.
>>>>>>>    - By default, this feature is pretty isolated (in a separate
>>>>>>>    package) from the source code point of view (94% of the source code
>>>>>>>    lines are in the new files)
>>>>>>>    - Thorough documentation has been added:
>>>>>>>       - overview.doc
>>>>>>>       - metrics.doc
>>>>>>>       - cassandra.yaml doc
>>>>>>>       - NEWS.txt overview
>>>>>>>    - Five people (Andy Tolbert, Chris Lohfink, Francisco Guerrero,
>>>>>>>    Kristijonas Zalys, and Jaydeep) have contributed.
>>>>>>>    - The source code has been reviewed multiple times by the same
>>>>>>>    five people.
>>>>>>>
>>>>>>> *Test Coverage*
>>>>>>>
>>>>>>>    - Comprehensive test coverage has been added to cover all
>>>>>>>    aspects.
>>>>>>>    - The entire test suite has been passing.
>>>>>>>
>>>>>>>
>>>>>>> We are in the final review phase and nearly ready to merge. If
>>>>>>> anyone has any last-minute feedback, this is the final opportunity for
>>>>>>> review.
>>>>>>>
>>>>>>> Thank you!
>>>>>>> Andy Tolbert, Chris Lohfink, Francisco Guerrero, Kristijonas Zalys,
>>>>>>> and Jaydeep
>>>>>>>
>>>>>>
>>>>
>>>> --
>>>> -Dave
>>>>
>>>> David A. Herrington II
>>>> President and Chief Engineer
>>>> RhinoSource, Inc.
>>>>
>>>> *Data Lake Architecture, Cloud Computing and Advanced Analytics.*
>>>>
>>>> www.rhinosource.com
>>>>
>>>
