Re: Welcome Bret McGuire as Cassandra PMC Member!

2025-05-19 Thread Dmitry Konstantinov
Congratulations Bret!

On Mon, 19 May 2025 at 00:51, Bret McGuire  wrote:

>   Thank you everyone for the kind words!
>
>- Bret -
>
> On Sat, May 17, 2025 at 11:44 PM Patrick McFadin 
> wrote:
>
>> Welcome Bret and congratulations!
>>
>> On Sat, May 17, 2025 at 10:46 AM Jaydeep Chovatia <
>> chovatia.jayd...@gmail.com> wrote:
>>
>>> Congratulations!
>>>
>>> Jaydeep
>>>
>>> On May 16, 2025, at 6:47 PM, guo Maxwell  wrote:
>>>
>>>
>>> Congrats!!
>>>
>>> On Sat, 17 May 2025 at 08:10, Jon Haddad wrote:
>>>
 Congrats!!


 On Fri, May 16, 2025 at 4:13 PM Alexandre DUTRA 
 wrote:

> Great news, and so well deserved! Congratulations Bret!
>
> On Fri, 16 May 2025 at 19:00, Jeremiah Jordan wrote:
>
>> Congrats Bret!
>>
>> On May 16, 2025 at 4:00:12 PM, Mick Semb Wever 
>> wrote:
>>
>>>
>>> The Project Management Committee (PMC) for Apache Cassandra is
>>> delighted to announce that Bret McGuire has joined the PMC!  Thank
>>> you Bret for all your contributions to the project over the years,
>>> especially with so many of the drivers.
>>>
>>> The PMC - Project Management Committee - manages and guides the
>>> direction of the project, and is responsible for inviting new 
>>> committers and
>>>  PMC members to steward the longevity of the project.  See
>>> https://community.apache.org/pmc/responsibilities.html if you're
>>> interested in learning more about the rights and responsibilities of
>>>  PMC members.
>>>
>>> Please join us in welcoming Bret McGuire to his new role in our
>>> project!
>>>
>>> Mick, on behalf of the Apache Cassandra PMC
>>>
>>

-- 
Dmitry Konstantinov


Re: FixVersion house cleaning

2025-05-19 Thread Josh McKenzie
If we bump everything to 6.x including bugs, people can lazily evaluate, when 
they work on them, how far back the bugfix should apply. So either way 
(over-rotate and proactively flag too many for 5.0.x, or just bulk them all to 
6.x and add back on completion), we should theoretically end up with the same 
final outcome.

I think the straight bump to 6.x is probably better since having missing 
information is less misleading than having *bad* version targeting information.

On Sat, May 17, 2025, at 10:30 AM, Mick Semb Wever wrote:
>  .
>   
> On Sat, 17 May 2025 at 14:57, Josh McKenzie  wrote:
>> __
>> With the dropping of .MINOR in semver simplifying some things in our release 
>> we have some FixVersion updating to consider.
>> 
>> For those that might not know - we use the ".X" FixVersion to indicate 
>> something is intended for a specific release line, then resolve the "X" to 
>> the number of the release it's merged into. For example, if the current 
>> major release is 5.0.4, a ticket intended for that line would have 
>> FixVersion "5.0.x", then on merge, would get the next unreleased version in 
>> that line "5.0.5".
>> 
>> I searched on our wiki and don't see this documented actually; might be good 
>> to document this on our Release Versioning wiki article.
> 
> 
> 
> Currently this is only documented on the jira version descriptions.  To wiki, 
> yes please!
> 
> 
>  
>> 
>> With the move away from .MINOR, we need to figure out what we want to do 
>> with the "5.x" FixVersion tickets. Here's a breakdown of count by type 
>> w/links to JQL queries:
>>  • All 5.x tickets: 761
>>  • New Feature: 91
>>  • Improvement: 326
>>  • Bug + Task: 304 (233 are 5.x only w/out 5.0.x; probably need to update 
>> these...)
>> My initial thought on how we tackle this:
>>  1. Replace "5.x" in FixVersions with "6.x"
>>  2. Add "5.0.x" to the 233 bug/task targeting "5.x" (w/the understanding 
>> some of those will actually be 6.x intended only)
>> Thoughts?
> 
> 
> 
> If things are correct, according to (the previous) plan it should just be 
> about moving all "5.x" tickets to "6.x" (and deleting "5.x").  And also 
> renaming "5.1" to "6.0".  But I'm not sure about adding "5.0.x" without any 
> inspection as to whether a) we want the fix in 5.0.x and b) it's feasible to 
> fix it in a patch version.
> 
> So
> 1) yes
> 2) no
> 
> I'm sure there's plenty of 5.x bugs that should be 5.0.x, but I don't see 
> this as a big concern (it gets addressed when it's worked on), and there's 
> plenty of tickets (786!) that are reported and remain without any fixVersion 
> at all.   I of course don't have any objection to any committer triaging them 
> as appropriate for 5.0.x (and any earlier fixVersion), and updating them so.
> 
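
For reference, the bulk move in (1) could be scripted against the Jira REST v2
API. A minimal sketch in Python, with auth, pagination, and error handling
omitted (the endpoint details are assumptions to verify against the Jira docs):

import requests

JIRA = "https://issues.apache.org/jira"
session = requests.Session()  # assume credentials are already configured

# Find all CASSANDRA tickets still targeting the 5.x fixVersion.
resp = session.get(f"{JIRA}/rest/api/2/search",
                   params={"jql": 'project = CASSANDRA AND fixVersion = "5.x"',
                           "fields": "key",
                           "maxResults": 1000})
issues = resp.json()["issues"]

# Swap fixVersion 5.x for 6.x on each ticket.
for issue in issues:
    session.put(f"{JIRA}/rest/api/2/issue/{issue['key']}",
                json={"update": {"fixVersions": [{"remove": {"name": "5.x"}},
                                                 {"add": {"name": "6.x"}}]}})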


Re: One Board to Rule Them All (or: ecosystem JIRA's are now integrated in our kanban board)

2025-05-19 Thread Josh McKenzie
> It looks like the swimlane for "Patch 5.0.x" is missing.
Ugh. Thanks for pointing that out. Should be fixed now.

The swimlane was configured but tickets were getting caught up in an earlier 
lane. We have to decide where we want a ticket to live that has, for example, 
FixVersions (4.1.x, 5.0.x, 6.x). For now, **I've configured things so tickets 
will show up in the oldest release version to which they apply**, with the 
understanding that we merge all work forward to trunk unless otherwise 
specified.

A couple things that stand out from this filtering:
 • Next major is heavy at 608 issues
 • Patch 5.0.x has 119 issues
 • Patch 3.x still has 251 issues; we should triage these to see what needs to 
be dropped and pulled forward
 • "Missing Version" has 1845 issues; worth it to triage these I think.
I've added the following quick filters up top as well to focus the board on 
certain workflows:
 • Core Cassandra: limits to PROJECT = CASSANDRA only
 • EOL FixVersion: limits to tickets w/fixversion targeting out of support C* 
versions (the 251 mentioned above; 8 tickets targeting 5.0.x have an EOL 
fixversion as well)
 • Missing Version: tickets w/out versions. 1845 in Core Cassandra, dozens in 
various ecosystem projects

On Sat, May 17, 2025, at 10:35 AM, Mick Semb Wever wrote:
>   .
> 
> On Fri, 16 May 2025 at 20:53, Josh McKenzie  wrote:
>> __
>> https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=484
>> 
> 
> 
> 
> Nice, thank you Josh.
> 
> It looks like the swimlane for "Patch 5.0.x" is missing.
> 


Re: Stricter repair time requirements necessary to prevent data resurrection than advised by docs

2025-05-19 Thread Mike Sun
>
>> To simplify operations, the newly introduced in-built AutoRepair feature
>> in Cassandra (as part of CEP-37) includes intelligent behavior that
>> tracks the oldest repaired node in the cluster and prioritizes it for
>> repair. It also emits a range of metrics to assist operators. One key
>> metric, LongestUnrepairedSec, indicates how long it has been since the
>> last repair for any part of the data. Operators can create an alarm on
>> the metric if it becomes higher than the *gc_grace_seconds*.
>
>
This is great to hear! Thanks for pointing me to that Jaydeep. It will
definitely make it easier for operators to monitor and alarm on potential
expiring tombstone risks. I will update my post to include this upcoming
feature.

Best,
Mike Sun
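
A minimal sketch of such an alarm check in Python; how LongestUnrepairedSec is
actually read (JMX, a metrics exporter, etc.) is left as a hypothetical helper:

# Sketch: alert when any range has gone unrepaired longer than
# gc_grace_seconds. read_longest_unrepaired_sec() is a hypothetical
# stand-in for however the CEP-37 metric is scraped in practice.
GC_GRACE_SECONDS = 10 * 24 * 3600  # e.g. 10 days

def check_repair_lag(read_longest_unrepaired_sec):
    lag = read_longest_unrepaired_sec()
    if lag >= GC_GRACE_SECONDS:
        raise RuntimeError(
            f"LongestUnrepairedSec={lag}s exceeds gc_grace_seconds; "
            "tombstones may expire before the next repair reaches them")
    return lag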

On Sat, May 17, 2025 at 12:54 PM Mike Sun  wrote:
>
>> Jeremiah, you’re right, I’ve been using “repair” to mean a cluster-level
>> repair as opposed to a single “nodetool repair” operation, and the
>> Cassandra docs mean “nodetool repair” when referring to a repair. Thanks
>> for pointing that out! I agree that the recommendation to run a “nodetool
>> repair” on every node or token range every 7 days with a gc_grace_seconds =
>> 10 days should practically prevent data resurrection.
>>
>> I still think theoretically though, starting and completing each nodetool
>> repair operation within gc_grace_seconds won't absolutely guarantee that
>> no tombstone expires before it has been repaired. nodetool repair operations
>> on the same node+token range(s) don't always take the same amount of time to
>> run and therefore don't guarantee that specific tokens are always repaired
>> at the same elapsed point within each run.
>>
>> e.g. if gc_grace_seconds=10 hours, nodetool repair is run every 7 hours,
>> and nodetool repair operations can take between 2 and 5 hours (a worked
>> check of this timeline follows below)
>>
>>- 00:00 - nodetool repair 1 starts on node A
>>- 00:30 - nodetool repair 1 repairs token T
>>- 01:00 - token T is deleted
>>- 02:00 - nodetool repair 1 completes
>>- 07:00 - nodetool repair 2 starts on node A
>>- 11:00 - tombstone for token T expires
>>- 11:30 - nodetool repair 2 repairs token T
>>    - 12:00 - nodetool repair 2 completes
>>
>> In reality, I agree this is very unlikely to happen. But if we’re looking
>> to establish a rigorous requirement that prevents any chance of data
>> resurrection, then I believe it’s the invariant I proposed for
>> “cluster-level repairs”—that two consecutive complete repairs must succeed
>> within gc_grace_seconds. Theoretical risk of data resurrection is something
>> that keeps me up at night! :).
>>
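
As a worked check of the timeline above (plain Python; the per-token repair
times are taken from the example, all in hours):

gc_grace = 10  # hours, from the example

# Token T is repaired 0.5h into repair 1 (starts 00:00) and 4.5h into
# repair 2 (starts 07:00), i.e. at absolute hours 0.5 and 11.5.
repairs_of_token_T = [0.5, 11.5]

# The invariant that matters is per token: the gap between consecutive
# repairs of the same token must stay under gc_grace.
for prev, nxt in zip(repairs_of_token_T, repairs_of_token_T[1:]):
    gap = nxt - prev
    status = "OK" if gap < gc_grace else "RESURRECTION RISK"
    print(f"token T repaired at {prev}h and {nxt}h -> gap {gap}h: {status}")

# gap = 11h > gc_grace = 10h, even though each repair run started within
# 7h of the previous one and completed well inside gc_grace.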
>> More practically, in my experience with Cassandra and Scylla clusters, I
>> think most operators reason about repairs as “cluster-level” as opposed to
>> individual “nodetool repair” operations, especially due to the use of
>> Reaper for Cassandra and Scylla Manager. Reaper and Scylla Manager repairs
>> jobs are cluster-level and repair admin+monitoring is generally at the
>> cluster-level, e.g. cluster-level repair schedules, durations,
>> success/completions.
>>
>> Repairs managed by Reaper and Scylla Manager do not guarantee a
>> deterministic ordering or timing of individual nodetool repair operations
>> they manage between separate cycles, breaking the "you are performing the
>> cycles in the same order around the nodes every time” assumption. That’s
>> the context from which my original cluster-level repair example comes.
>>
>> Thanks for the helpful discussion, I will update my blog post to reflect
>> the helpful clarifications!
>>
>> On Fri, May 16, 2025 at 5:25 PM Jeremiah Jordan 
>> wrote:
>>
>>> I agree we need to do a better job of wording this so people can
>>> understand what is happening.
>>>
>>> For your exact example here, you are actually looking at too broad of a
>>> thing.  The exact requirements are not at the full cluster level, but
>>> actually at the “token range” level at which repair operates, a given token
>>> range needs to have repair start and complete within the gc_grace sliding
>>> window.  For your example of a repair cycle that takes 5 days, and is
>>> started every 7 days, assuming you are performing the cycles in the same
>>> order around the nodes every time, a given node will have been repaired
>>> within 7 days, even though the start of repair 1 to the finish of repair 2
>>> was more than 7 days.  The start of “token ranges repaired on day 0” to the
>>> finish of “token ranges repaired on day 7” is less than the gc_grace window.
>>>
>>> -Jeremiah Jordan
>>>
>>> On May 16, 2025 at 2:03:00 PM, Mike Sun  wrote:
>>>
 The wording is subtle and can be confusing...

 It's important to distinguish between:
 1. "You need to start and complete a repair within any gc_grace_sec

Re: [VOTE] Release Apache Cassandra 4.1.9

2025-05-19 Thread Brandon Williams
+1

Kind Regards,
Brandon

On Fri, May 16, 2025 at 7:32 AM Brandon Williams  wrote:
>
> Proposing the test build of Cassandra 4.1.9 for release.
>
> sha1: 82fc35b0136cc5f706032759d73ac5ae02c20871
> Git: https://github.com/apache/cassandra/tree/4.1.9-tentative
> Maven Artifacts:
> https://repository.apache.org/content/repositories/orgapachecassandra-1393/org/apache/cassandra/cassandra-all/4.1.9/
>
> The Source and Build Artifacts, and the Debian and RPM packages and
> repositories, are available here:
> https://dist.apache.org/repos/dist/dev/cassandra/4.1.9/
>
> The vote will be open for 72 hours (longer if needed). Everyone who
> has tested the build is invited to vote. Votes by PMC members are
> considered binding. A vote passes if there are at least three binding
> +1s and no -1's.
>
> [1]: CHANGES.txt:
> https://github.com/apache/cassandra/blob/4.1.9-tentative/CHANGES.txt
> [2]: NEWS.txt: 
> https://github.com/apache/cassandra/blob/4.1.9-tentative/NEWS.txt
>
>
> Kind Regards,
> Brandon


Re: Welcome Bret McGuire as Cassandra PMC Member!

2025-05-19 Thread David Capwell
Congrats!

> On May 19, 2025, at 4:13 AM, Dmitry Konstantinov  wrote:
> 
> Congratulations Bret!
> 
> On Mon, 19 May 2025 at 00:51, Bret McGuire wrote:
>>   Thank you everyone for the kind words!
>> 
>>- Bret -
>> 
>> On Sat, May 17, 2025 at 11:44 PM Patrick McFadin wrote:
>>> Welcome Bret and congratulations!
>>> 
>>> On Sat, May 17, 2025 at 10:46 AM Jaydeep Chovatia
>>> <chovatia.jayd...@gmail.com> wrote:
 Congratulations!
 
 Jaydeep
 
> On May 16, 2025, at 6:47 PM, guo Maxwell wrote:
> 
>
> Congrats!!
> 
> On Sat, 17 May 2025 at 08:10, Jon Haddad <j...@rustyrazorblade.com> wrote:
>> Congrats!!
>> 
>> 
>> On Fri, May 16, 2025 at 4:13 PM Alexandre DUTRA wrote:
>>> Great news, and so well deserved! Congratulations Bret!
>>> 
>>> On Fri, 16 May 2025 at 19:00, Jeremiah Jordan wrote:
 Congrats Bret!
 
 On May 16, 2025 at 4:00:12 PM, Mick Semb Wever wrote:
> 
> The Project Management Committee (PMC) for Apache Cassandra is 
> delighted to announce that Bret McGuire has joined the PMC!  Thank 
> you Bret for all your contributions to the project over the years, 
> especially with so many of the drivers.
> 
> The PMC - Project Management Committee - manages and guides the 
> direction of the project, and is responsible for inviting new 
> committers and PMC members to steward the longevity of the project.  
> See https://community.apache.org/pmc/responsibilities.html if you're 
> interested in learning more about the rights and responsibilities of 
> PMC members.
> 
> Please join us in welcoming Bret McGuire to his new role in our 
> project!
> 
> Mick, on behalf of the Apache Cassandra PMC
> 
> 
> 
> --
> Dmitry Konstantinov



Re: [VOTE] Release Apache Cassandra 4.1.9

2025-05-19 Thread Michael Shuler

+1

On 5/16/25 06:32, Brandon Williams wrote:

Proposing the test build of Cassandra 4.1.9 for release.

sha1: 82fc35b0136cc5f706032759d73ac5ae02c20871
Git: https://github.com/apache/cassandra/tree/4.1.9-tentative
Maven Artifacts:
https://repository.apache.org/content/repositories/orgapachecassandra-1393/org/apache/cassandra/cassandra-all/4.1.9/

The Source and Build Artifacts, and the Debian and RPM packages and
repositories, are available here:
https://dist.apache.org/repos/dist/dev/cassandra/4.1.9/

The vote will be open for 72 hours (longer if needed). Everyone who
has tested the build is invited to vote. Votes by PMC members are
considered binding. A vote passes if there are at least three binding
+1s and no -1's.

[1]: CHANGES.txt:
https://github.com/apache/cassandra/blob/4.1.9-tentative/CHANGES.txt
[2]: NEWS.txt: https://github.com/apache/cassandra/blob/4.1.9-tentative/NEWS.txt


Kind Regards,
Brandon




Re: Stricter repair time requirements necessary to prevent data resurrection than advised by docs

2025-05-19 Thread Mike Sun
Thanks everyone for your helpful feedback! I've updated my blog post to
hopefully reflect these clarifications:
https://msun.io/cassandra-scylla-repairs/


On Mon, May 19, 2025 at 9:27 AM Mike Sun  wrote:

>>> To simplify operations, the newly introduced in-built AutoRepair feature
>>> in Cassandra (as part of CEP-37) includes intelligent behavior that
>>> tracks the oldest repaired node in the cluster and prioritizes it for
>>> repair. It also emits a range of metrics to assist operators. One key
>>> metric, LongestUnrepairedSec, indicates how long it has been since the
>>> last repair for any part of the data. Operators can create an alarm on
>>> the metric if it becomes higher than the *gc_grace_seconds*.
>>
>>
> This is great to hear! Thanks for pointing me to that Jaydeep. It will
> definitely make it easier for operators to monitor and alarm on potential
> expiring tombstone risks. I will update my post to include this upcoming
> feature.
>
> Best,
> Mike Sun
>
> On Sat, May 17, 2025 at 12:54 PM Mike Sun  wrote:
>>
>>> Jeremiah, you’re right, I’ve been using “repair” to mean a cluster-level
>>> repair as opposed to a single “nodetool repair” operation, and the
>>> Cassandra docs mean “nodetool repair” when referring to a repair. Thanks
>>> for pointing that out! I agree that the recommendation to run a “nodetool
>>> repair” on every node or token range every 7 days with a gc_grace_seconds =
>>> 10 days should practically prevent data resurrection.
>>>
>>> I still think theoretically though, starting and completing each
>>> nodetool repair operation within gc_grace_seconds won't absolutely
>>> guarantee that no tombstone expires before it has been repaired. nodetool
>>> repair operations on the same node+token range(s) don't always take the
>>> same amount of time to run and therefore don't guarantee that specific
>>> tokens are always repaired at the same elapsed point within each run.
>>>
>>> e.g. if gc_grace_seconds=10 hours, nodetool repair is run every 7 hours,
>>> and nodetool repair operations can take between 2 and 5 hours
>>>
>>>- 00:00 - nodetool repair 1 starts on node A
>>>- 00:30 - nodetool repair 1 repairs token T
>>>- 01:00 - token T is deleted
>>>- 02:00 - nodetool repair 1 completes
>>>- 07:00 - nodetool repair 2 starts on node A
>>>- 11:00 - tombstone for token T expires
>>>- 11:30 - nodetool repair 2 repairs token T
>>>    - 12:00 - nodetool repair 2 completes
>>>
>>> In reality, I agree this is very unlikely to happen. But if we’re
>>> looking to establish a rigorous requirement that prevents any chance of
>>> data resurrection, then I believe it’s the invariant I proposed for
>>> “cluster-level repairs”—that two consecutive complete repairs must succeed
>>> within gc_grace_seconds. Theoretical risk of data resurrection is something
>>> that keeps me up at night! :).
>>>
>>> More practically, in my experience with Cassandra and Scylla clusters, I
>>> think most operators reason about repairs as “cluster-level” as opposed to
>>> individual “nodetool repair” operations, especially due to the use of
>>> Reaper for Cassandra and Scylla Manager. Reaper and Scylla Manager repairs
>>> jobs are cluster-level and repair admin+monitoring is generally at the
>>> cluster-level, e.g. cluster-level repair schedules, durations,
>>> success/completions.
>>>
>>> Repairs managed by Reaper and Scylla Manager do not guarantee a
>>> deterministic ordering or timing of individual nodetool repair operations
>>> they manage between separate cycles, breaking the "you are performing the
>>> cycles in the same order around the nodes every time” assumption. That’s
>>> the context from which my original cluster-level repair example comes.
>>>
>>> Thanks for the helpful discussion, I will update my blog post to reflect
>>> the helpful clarifications!
>>>
>>> On Fri, May 16, 2025 at 5:25 PM Jeremiah Jordan 
>>> wrote:
>>>
 I agree we need to do a better job of wording this so people can
 understand what is happening.

 For your exact example here, you are actually looking at too broad of a
 thing.  The exact requirements are not at the full cluster level, but
 actually at the “token range” level at which repair operates, a given token
 range needs to have repair start and complete within the gc_grace sliding
 window.  For your example of a repair cycle that takes 5 days, and is
 started every 7 days, assuming you are performing the cycles in the same
 order around the nodes every time, a given node will have been repaired
 within 7 days, even though the start of repair 1 to the finish of repair 2
 was more than 7 days.  The start of “token ranges repaired on day 0” to the
 finish of “token ranges repaired on day 7” is less than the gc_grace window.

Re: Welcome Bret McGuire as Cassandra PMC Member!

2025-05-19 Thread Yifan Cai
Congratulations Bret!

On Mon, May 19, 2025 at 8:41 AM David Capwell  wrote:

> Congrats!
>
> On May 19, 2025, at 4:13 AM, Dmitry Konstantinov 
> wrote:
>
> Congratulations Bret!
>
> On Mon, 19 May 2025 at 00:51, Bret McGuire  wrote:
>
>>   Thank you everyone for the kind words!
>>
>>- Bret -
>>
>> On Sat, May 17, 2025 at 11:44 PM Patrick McFadin 
>> wrote:
>>
>>> Welcome Bret and congratulations!
>>>
>>> On Sat, May 17, 2025 at 10:46 AM Jaydeep Chovatia <
>>> chovatia.jayd...@gmail.com> wrote:
>>>
 Congratulations!

 Jaydeep

 On May 16, 2025, at 6:47 PM, guo Maxwell  wrote:

 Congrats!!

 On Sat, 17 May 2025 at 08:10, Jon Haddad wrote:

> Congrats!!
>
>
> On Fri, May 16, 2025 at 4:13 PM Alexandre DUTRA 
> wrote:
>
>> Great news, and so well deserved! Congratulations Bret!
>>
>> On Fri, 16 May 2025 at 19:00, Jeremiah Jordan wrote:
>>
>>> Congrats Bret!
>>>
>>> On May 16, 2025 at 4:00:12 PM, Mick Semb Wever 
>>> wrote:
>>>

 The Project Management Committee (PMC) for Apache Cassandra is
 delighted to announce that Bret McGuire has joined the PMC!  Thank
 you Bret for all your contributions to the project over the years,
 especially with so many of the drivers.

 The PMC - Project Management Committee - manages and guides the
 direction of the project, and is responsible for inviting new 
 committers and
  PMC members to steward the longevity of the project.  See
 https://community.apache.org/pmc/responsibilities.html if you're
 interested in learning more about the rights and responsibilities of
  PMC members.

 Please join us in welcoming Bret McGuire to his new role in our
 project!

 Mick, on behalf of the Apache Cassandra PMC

>>>
>
> --
> Dmitry Konstantinov
>
>
>


Re: [VOTE][IP CLEARANCE] easy-cass-stress

2025-05-19 Thread Jon Haddad
The repo is now under the apache org:
https://github.com/apache/cassandra-easy-stress

On Mon, May 12, 2025 at 12:41 PM Jordan West  wrote:

> Great! With that the vote passes with 10 +1s and no -1s. We are ready to
> initiate the transfer once INFRA picks it back up. Looks like Jon has been
> working with them on it.
>
> Jordan
>
> On Mon, May 12, 2025 at 08:43 Mick Semb Wever  wrote:
>
>> Great, thanks!
>>
>> +1 on the vote.
>>
>>
>> On Mon, 12 May 2025 at 17:27, Jordan West  wrote:
>>
>>> Mick I've addressed your two comments on the PR and merged it to the
>>> main branch. I believe everything should be completed to remove your minus
>>> one but let me know if you have further concerns.
>>>
>>> Jordan
>>>
>>> On Sun, May 11, 2025 at 3:31 PM Jordan West  wrote:
>>>
 The infra ticket is here and has been open for a bit
 https://issues.apache.org/jira/browse/INFRA-26785


 Will merge the PR when Im back at my machine

 Jordan

 On Sun, May 11, 2025 at 14:44 Mick Semb Wever  wrote:

>
> .
>
>
>> That said, it has been committed. I don't see it yet reflected on the
>> live page but I assume (hope) there is some amount of auto-deployment.
>>
>
>
> I think it's hourly, and I see it now at
> https://incubator.apache.org/ip-clearance/cassandra-easy-cass-stress.html
>
>
> Please merge the PR before the org transfer.  It's how we've done it
> with the recent ip donations.  I'm thinking it's more important that we
> don't start committing anything to the project once it is under apache/
> without the correct copyright in place, than it is having the asf 
> copyright
> before being under apache/  (when people make contributions to existing
> files, it needs to be implied that their contributions are copyright to the
> ASF).
>
>


Re: [DISCUSS] CEP-48: First-Class Materialized View Support

2025-05-19 Thread Jon Haddad
We could also track the deletes that need to be made to the view, in
another SSTable component on the base.  That way you can actually do repair
with tombstones.




On Mon, May 19, 2025 at 11:37 AM Blake Eggleston 
wrote:

> If we went the storage attached route then I think you’d just need more
> memory for the memtable, compaction would just be combining 2 sorted sets,
> though there would probably be some additional work related to deletes,
> overwrites, and tombstone purging.
>
> Regarding the size of the index, I think Jon was on the right track with
> his sstable attached merkle tree idea. You don’t need to duplicate the full
> data set in the index, you just need enough info to detect that something
> is missing. If you can detect that view partition x is missing data from
> base partition y, then you could start comparing the actual partition data
> and figure out who’s missing what.
>
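
A toy sketch of that detection step in Python. The per-partition entries
standing in for whatever compact info the sstable-attached index would
actually store are assumptions:

import hashlib

def digest(entries):
    # Order-independent digest over (view_key, base_key) entries, standing
    # in for the compact per-partition summary an index could keep.
    acc = 0
    for e in sorted(entries):
        h = hashlib.sha256(repr(e).encode()).digest()
        acc ^= int.from_bytes(h[:8], "big")
    return acc

# Entries derived from the base table vs. entries present in the view,
# grouped by view partition (toy data).
expected_from_base = {"vp1": {("vp1", "b1"), ("vp1", "b2")}}
actual_in_view = {"vp1": {("vp1", "b1")}}

for vp, expected in expected_from_base.items():
    actual = actual_in_view.get(vp, set())
    if digest(expected) != digest(actual):
        # A cheap mismatch signal; only now read and compare the real
        # partition data to figure out who is missing what.
        print(f"view partition {vp} differs; missing: {expected - actual}")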
> On Sun, May 18, 2025, at 9:20 PM, Runtian Liu wrote:
>
> > If you had a custom SAI index or something, this isn’t something you’d
> need to worry about
> This is what I missed.
>
> I think this could be a potential solution, but comparing indexes alone
> isn’t sufficient—it only handles cases where the MV has extra or missing
> rows. It doesn’t catch data mismatches for rows that exist in both the base
> table and MV. To address that, we may need to extend SAI for MV to store
> the entire selected dataset in the index file, applying the same approach
> to MV as we do for the base table. This would increase storage to roughly
> 4x per MV, compared to the current 2x, but it would help avoid random disk
> access during repair. I’m not sure if this would introduce any memory
> issues during compaction.
>
>
>
> On Sun, May 18, 2025 at 8:09 PM Blake Eggleston 
> wrote:
>
>
> It *might* be more efficient, but it’s also more brittle. I think it
> would be more fault tolerant and less trouble overall to repair
> intersecting token ranges. So you’re not repairing a view partition, you’re
> repairing the parts of a view partition that intersect with a base table
> token range.
>
> The issues I see with the global snapshot are:
>
> 1. Requiring a global snapshot means that you can’t start a new repair
> cycle if there’s a node down.
> 2. These merkle trees can’t all be calculated at once, so we’ll need a
> coordination mechanism to spread out scans of the snapshots
> 3. By requiring a global snapshot and then building merkle trees from that
> snapshot, you’re introducing a delay of however long it takes you to do a
> full scan of both tables. So if you’re repairing your cluster every 3 days,
> it means the last range to get repaired is repairing based on a state
> that’s now 3 days old. This makes your repair horizon 2x your scheduling
> cadence and puts an upper bound on how up to date you can keep your view.
>
> With an index based approach, much of the work is just built into the
> write and compaction paths and repair is just a scan of the intersecting
> index segments from the base and view tables. You’re also repairing from
> the state that existed when you started your repair, so your repair horizon
> matches your scheduling cadence.
>
> On Sun, May 18, 2025, at 7:45 PM, Jaydeep Chovatia wrote:
>
> >Isn’t the reality here that repairing a single partition in the base
> table is potentially a full cluster-wide scan of the MV if you also want to
> detect rows in the MV that don’t exist in the base table (eg resurrection
> or a missed delete)
> Exactly. Since materialized views (MVs) are partitioned differently from
> their base tables, there doesn’t appear to be a more efficient way to
> repair them in a targeted manner—meaning we can’t restrict the repair to
> only a small portion of the data.
>
> Jaydeep
>
> On Sun, May 18, 2025 at 5:57 PM Jeff Jirsa  wrote:
>
>
> Isn’t the reality here that repairing a single partition in the base
> table is potentially a full cluster-wide scan of the MV if you also want to
> detect rows in the MV that don’t exist in the base table (eg resurrection
> or a missed delete)
>
> There’s no getting around that. Keeping an extra index doesn’t avoid that
> scan, it just moves the problem around to another tier.
>
>
>
> On May 18, 2025, at 4:59 PM, Blake Eggleston  wrote:
>
>
> Whether it’s index based repair or another mechanism, I think the proposed
> repair design needs to be refined. The requirement of a global snapshot and
> merkle tree build before we can start detecting and fixing problems is a
> pretty big limitation.
>
> > Data scans during repair would become random disk accesses instead of
> sequential ones, which can degrade performance.
>
> You’d only be reading and comparing the index files, not the sstable
> contents. Reads would still be sequential.
>
> > Most importantly, I decided against this approach due to the complexity
> of ensuring index consistency. Introducing secondary indexes opens up new
> challenges, such as keeping them in sync with the actual data.
>
> I

Re: [DISCUSS] CEP-48: First-Class Materialized View Support

2025-05-19 Thread Runtian Liu
> You don’t need to duplicate the full data set in the index, you just need
enough info to detect that something is missing.
Could you please explain how this would work?
If we build Merkle trees or compute hashes at the SSTable level, how would
this case be handled?
For example, consider the following schema for a hypothetical base table:
CREATE TABLE base (
  pk int PRIMARY KEY,
  v1 int,
  v2 int,
  v3 int
);

Suppose the data is stored as follows:

*Node 1*
   - SSTable1: (1, 1, null, 1)
   - SSTable2: (1, null, 1, 1)

*Node 2*
   - SSTable3: (1, 1, 1, 1)

How can we ensure that a hash or Merkle tree computed at the SSTable level
would produce the same result on both nodes for this row?
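
To make the concern concrete, a toy Python illustration of hashing
per-SSTable row fragments versus hashing the merged row:

import hashlib

def h(x):
    return hashlib.sha256(repr(x).encode()).hexdigest()[:12]

# Node 1 stores the row as two fragments in two SSTables.
sstable1 = (1, 1, None, 1)
sstable2 = (1, None, 1, 1)
# Node 2 stores the fully merged row in one SSTable.
sstable3 = (1, 1, 1, 1)

# Hashing per SSTable gives node 1 two digests, neither matching node 2's:
print(h(sstable1), h(sstable2), "vs", h(sstable3))

# Only after merging the fragments (column-wise, as reads do) do they agree:
merged = tuple(a if a is not None else b for a, b in zip(sstable1, sstable2))
print(h(merged) == h(sstable3))  # True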

On Mon, May 19, 2025 at 1:54 PM Jon Haddad  wrote:

> We could also track the deletes that need to be made to the view, in
> another SSTable component on the base.  That way you can actually do repair
> with tombstones.
>
>
>
>
> On Mon, May 19, 2025 at 11:37 AM Blake Eggleston 
> wrote:
>
>> If we went the storage attached route then I think you’d just need more
>> memory for the memtable, compaction would just be combining 2 sorted sets,
>> though there would probably be some additional work related to deletes,
>> overwrites, and tombstone purging.
>>
>> Regarding the size of the index, I think Jon was on the right track with
>> his sstable attached merkle tree idea. You don’t need to duplicate the full
>> data set in the index, you just need enough info to detect that something
>> is missing. If you can detect that view partition x is missing data from
>> base partition y, then you could start comparing the actual partition data
>> and figure out who’s missing what.
>>
>> On Sun, May 18, 2025, at 9:20 PM, Runtian Liu wrote:
>>
>> > If you had a custom SAI index or something, this isn’t something you’d
>> need to worry about
>> This is what I missed.
>>
>> I think this could be a potential solution, but comparing indexes alone
>> isn’t sufficient—it only handles cases where the MV has extra or missing
>> rows. It doesn’t catch data mismatches for rows that exist in both the base
>> table and MV. To address that, we may need to extend SAI for MV to store
>> the entire selected dataset in the index file, applying the same approach
>> to MV as we do for the base table. This would increase storage to roughly
>> 4x per MV, compared to the current 2x, but it would help avoid random disk
>> access during repair. I’m not sure if this would introduce any memory
>> issues during compaction.
>>
>>
>>
>> On Sun, May 18, 2025 at 8:09 PM Blake Eggleston 
>> wrote:
>>
>>
>> It *might* be more efficient, but it’s also more brittle. I think it
>> would be more fault tolerant and less trouble overall to repair
>> intersecting token ranges. So you’re not repairing a view partition, you’re
>> repairing the parts of a view partition that intersect with a base table
>> token range.
>>
>> The issues I see with the global snapshot are:
>>
>> 1. Requiring a global snapshot means that you can’t start a new repair
>> cycle if there’s a node down.
>> 2. These merkle trees can’t all be calculated at once, so we’ll need a
>> coordination mechanism to spread out scans of the snapshots
>> 3. By requiring a global snapshot and then building merkle trees from
>> that snapshot, you’re introducing a delay of however long it takes you to
>> do a full scan of both tables. So if you’re repairing your cluster every 3
>> days, it means the last range to get repaired is repairing based on a state
>> that’s now 3 days old. This makes your repair horizon 2x your scheduling
>> cadence and puts an upper bound on how up to date you can keep your view.
>>
>> With an index based approach, much of the work is just built into the
>> write and compaction paths and repair is just a scan of the intersecting
>> index segments from the base and view tables. You’re also repairing from
>> the state that existed when you started your repair, so your repair horizon
>> matches your scheduling cadence.
>>
>> On Sun, May 18, 2025, at 7:45 PM, Jaydeep Chovatia wrote:
>>
>> >Isn’t the reality here that repairing a single partition in the base
>> table is potentially a full cluster-wide scan of the MV if you also want to
>> detect rows in the MV that don’t exist in the base table (eg resurrection
>> or a missed delete)
>> Exactly. Since materialized views (MVs) are partitioned differently from
>> their base tables, there doesn’t appear to be a more efficient way to
>> repair them in a targeted manner—meaning we can’t restrict the repair to
>> only a small portion of the data.
>>
>> Jaydeep
>>
>> On Sun, May 18, 2025 at 5:57 PM Jeff Jirsa  wrote:
>>
>>
>> Isn’t the reality here that repairing a single partition in the base
>> table is potentially a full cluster-wide scan of the MV if you also want to
>> detect rows in the MV that don’t exist in the base table (eg resurrection
>> or a missed delete)
>>
>> There’s no getting around that. Keeping an extra index doesn’t avoid that
>> scan, it just moves the problem around to another tier.

Re: [VOTE] Release Apache Cassandra 4.1.9

2025-05-19 Thread Brandon Williams
With four +1 votes (all binding) and no -1, the vote passes.  I'll get
the artifacts published.

Kind Regards,
Brandon

On Fri, May 16, 2025 at 7:32 AM Brandon Williams  wrote:
>
> Proposing the test build of Cassandra 4.1.9 for release.
>
> sha1: 82fc35b0136cc5f706032759d73ac5ae02c20871
> Git: https://github.com/apache/cassandra/tree/4.1.9-tentative
> Maven Artifacts:
> https://repository.apache.org/content/repositories/orgapachecassandra-1393/org/apache/cassandra/cassandra-all/4.1.9/
>
> The Source and Build Artifacts, and the Debian and RPM packages and
> repositories, are available here:
> https://dist.apache.org/repos/dist/dev/cassandra/4.1.9/
>
> The vote will be open for 72 hours (longer if needed). Everyone who
> has tested the build is invited to vote. Votes by PMC members are
> considered binding. A vote passes if there are at least three binding
> +1s and no -1's.
>
> [1]: CHANGES.txt:
> https://github.com/apache/cassandra/blob/4.1.9-tentative/CHANGES.txt
> [2]: NEWS.txt: 
> https://github.com/apache/cassandra/blob/4.1.9-tentative/NEWS.txt
>
>
> Kind Regards,
> Brandon


Re: [VOTE][IP CLEARANCE] easy-cass-stress

2025-05-19 Thread Paulo Motta
Nice work, congrats to all involved! 🎉

On Mon, May 19, 2025 at 2:59 PM Jon Haddad  wrote:

> The repo is now under the apache org:
> https://github.com/apache/cassandra-easy-stress
>
> On Mon, May 12, 2025 at 12:41 PM Jordan West  wrote:
>
>> Great! With that the vote passes with 10 +1s and no -1s. We are ready to
>> initiate the transfer once INFRA picks it back up. Looks like Jon has been
>> working with them on it.
>>
>> Jordan
>>
>> On Mon, May 12, 2025 at 08:43 Mick Semb Wever  wrote:
>>
>>> Great, thanks!
>>>
>>> +1 on the vote.
>>>
>>>
>>> On Mon, 12 May 2025 at 17:27, Jordan West  wrote:
>>>
 Mick I've addressed your two comments on the PR and merged it to the
 main branch. I believe everything should be completed to remove your minus
 one but let me know if you have further concerns.

 Jordan

 On Sun, May 11, 2025 at 3:31 PM Jordan West  wrote:

> The infra ticket is here and has been open for a bit
> https://issues.apache.org/jira/browse/INFRA-26785
>
>
> Will merge the PR when Im back at my machine
>
> Jordan
>
> On Sun, May 11, 2025 at 14:44 Mick Semb Wever  wrote:
>
>>
>> .
>>
>>
>>> That said, it has been committed. I don't see it yet reflected on
>>> the live page but I assume (hope) there is some amount of 
>>> auto-deployment.
>>>
>>
>>
>> I think it's hourly, and I see it now at
>> https://incubator.apache.org/ip-clearance/cassandra-easy-cass-stress.html
>>
>>
>> Please merge the PR before the org transfer.  It's how we've done it
>> with the recent ip donations.  I'm thinking it's more important that we
>> don't start committing anything to the project once it is under apache/
>> without the correct copyright in place, than it is having the asf 
>> copyright
>> before being under apache/  (when people make contributions to existing
>> files, it needs to be implied that their contributions are copyright to 
>> the
>> ASF).
>>
>>


Re: [VOTE][IP CLEARANCE] easy-cass-stress

2025-05-19 Thread Marouan REJEB
Cool! I've just opened a PR: update documentation links to reflect Apache
repository migration

On Tue, May 20, 2025 at 12:13 AM Paulo Motta  wrote:

> Nice work, congrats to all involved! 🎉
>
> On Mon, May 19, 2025 at 2:59 PM Jon Haddad 
> wrote:
>
>> The repo is now under the apache org:
>> https://github.com/apache/cassandra-easy-stress
>>
>> On Mon, May 12, 2025 at 12:41 PM Jordan West  wrote:
>>
>>> Great! With that the vote passes with 10 +1s and no -1s. We are ready to
>>> initiate the transfer once INFRA picks it back up. Looks like Jon has been
>>> working with them on it.
>>>
>>> Jordan
>>>
>>> On Mon, May 12, 2025 at 08:43 Mick Semb Wever  wrote:
>>>
 Great, thanks!

 +1 on the vote.


 On Mon, 12 May 2025 at 17:27, Jordan West  wrote:

> Mick I've addressed your two comments on the PR and merged it to the
> main branch. I believe everything should be completed to remove your minus
> one but let me know if you have further concerns.
>
> Jordan
>
> On Sun, May 11, 2025 at 3:31 PM Jordan West 
> wrote:
>
>> The infra ticket is here and has been open for a bit
>> https://issues.apache.org/jira/browse/INFRA-26785
>>
>>
>> Will merge the PR when Im back at my machine
>>
>> Jordan
>>
>> On Sun, May 11, 2025 at 14:44 Mick Semb Wever  wrote:
>>
>>>
>>> .
>>>
>>>
 That said, it has been committed. I don't see it yet reflected on
 the live page but I assume (hope) there is some amount of 
 auto-deployment.

>>>
>>>
>>> I think it's hourly, and I see it now at
>>> https://incubator.apache.org/ip-clearance/cassandra-easy-cass-stress.html
>>>
>>>
>>> Please merge the PR before the org transfer.  It's how we've done it
>>> with the recent ip donations.  I'm thinking it's more important that we
>>> don't start committing anything to the project once it is under apache/
>>> without the correct copyright in place, than it is having the asf 
>>> copyright
>>> before being under apache/  (when people make contributions to existing
>>> files, it needs to be implied that their contributions are copyright to 
>>> the
>>> ASF).
>>>
>>>


[RELEASE] Apache Cassandra 4.1.9 released

2025-05-19 Thread Brandon Williams
The Cassandra team is pleased to announce the release of Apache
Cassandra version 4.1.9.

Apache Cassandra is a fully distributed database. It is the right
choice when you need scalability and high availability without
compromising performance.

http://cassandra.apache.org/

Downloads of source and binary distributions are listed in our download section:

http://cassandra.apache.org/download/

This version is a bug fix release[1] on the 4.1 series. As always,
please pay attention to the release notes[2] and let us know[3] if you
encounter any problems.

[WARNING] Debian and RedHat package repositories have moved! Debian
/etc/apt/sources.list.d/cassandra.sources.list and RedHat
/etc/yum.repos.d/cassandra.repo files must be updated to the new
repository URLs. For Debian it is now
https://debian.cassandra.apache.org . For RedHat it is now
https://redhat.cassandra.apache.org/41x/ .

Enjoy!

[1]: CHANGES.txt
https://github.com/apache/cassandra/blob/cassandra-4.1.9/CHANGES.txt
[2]: NEWS.txt https://github.com/apache/cassandra/blob/cassandra-4.1.9/NEWS.txt
[3]: https://issues.apache.org/jira/browse/CASSANDRA

Kind Regards,
Brandon