Re: Push TCM (CEP-21) and Accord (CEP-15) to 5.1 (and cut an immediate 5.1-alpha1)

2023-10-27 Thread Benjamin Lerer
I would be interested in testing Maxim's approach. We need more visibility
on big features and their progress to improve our coordination. Hopefully
it will also open the door to more collaboration on those big projects.

On Thu, Oct 26, 2023 at 21:35, German Eichberger via dev <
dev@cassandra.apache.org> wrote:

> +1 to Maxim's idea
>
> Like Stefan my assumption was that we would get some version of TCM +
> ACCORD in 5.0 but it wouldn't be ready for production use. My own testing
> and conversations at Community over Code in Halifax confirmed this.
>
> From this perspective, as disappointing as TCM+ACCORD slipping is, moving it
> to 5.1 makes sense and I am supportive of this - but I am worried that if 5.1 is
> basically 5.0 + TCM/ACCORD and this slips again, we paint ourselves into a
> corner where we can't release 5.2 before 5.1 or something. I would like
> some more elaboration on that.
>
> I am also very worried about ANN vector search being in jeopardy for 5.0
> which is an important feature for me to win some internal company bet 🙂
>
> My 2 cents,
> German
>
> --
> *From:* Miklosovic, Stefan via dev 
> *Sent:* Thursday, October 26, 2023 4:23 AM
> *To:* dev@cassandra.apache.org 
> *Cc:* Miklosovic, Stefan 
> *Subject:* [EXTERNAL] Re: Push TCM (CEP-21) and Accord (CEP-15) to 5.1
> (and cut an immediate 5.1-alpha1)
>
> What Maxim proposes in the last paragraph would be definitely helpful. Not
> for the project only but for a broader audience, companies etc., too.
>
> Until this thread was started, my assumption was that "there will be 5.0
> on summit with TCM and Accord and it somehow just happens". More
> transparent communication where we are at with high-profile CEPs like these
> and knowing if deadlines are going to be met would be welcome.
>
> I don't want to be that guy and don't take me wrong here, but really,
> these CEPs are being developed, basically, by devs from two companies,
> which have developers who do not have any real need to explain themselves
> like what they do, regularly, to outsiders. (or maybe you do, you just
> don't have time?) I get that. But on the other hand, you can not
> realistically expect that other folks will have any visibility into what is
> going on there and that there is a delay on the horizon and so on.
>
> 
> From: Maxim Muzafarov 
> Sent: Thursday, October 26, 2023 12:21
> To: dev@cassandra.apache.org
> Subject: Re: Push TCM (CEP-21) and Accord (CEP-15) to 5.1 (and cut an
> immediate 5.1-alpha1)
>
> Personally, I think frequent releases (2-3 per year) are better than
> infrequent big releases. I can understand all the concerns from a
> marketing perspective, as smaller major releases may not shine as
> brightly as a single "game changer" release. However, smaller
> releases, especially if they don't have backwards compatibility
> issues, are better for the engineering and SRE teams because if a
> long-awaited feature is delayed for any reason, there should be no
> worry about getting it in right into the next release.
>
> An analogy here might be that if you miss your train (small release)
> due to circumstances, you can wait right here for the next one, but if
> you miss a flight (big release), you will go back home :-) This is why
> I think that the 5.0, 5.1, 5.2, etc. are better and I support Mick's
> plan with the caveat that we should release 5.1 when we think we are
> ready to do so. Here is an example of the Postgres releases [1].
>
> [1] https://bucardo.org/postgres_all_versions.html
>
>
> Another little thing that I'd like to mention is a release management
> story. In the Apache Ignite project, we've got used to creating a
> release thread and posting the release status updates and/or problems,
> and/or delays there, and maybe some of the benchmarks at the end. Of
> course, this was done by the release manager who volunteered to do
> this work. I'm not saying we're doing anything wrong here, no, but the
> publicity and openness, coupled with regular updates, could help
> create a real sense of the remaining work in progress. These are my
> personal feelings, and definitely not actions to be taken. The example
> is here: [2].
>
> [2] https://lists.apache.org/thread/m11m0nxq701f2cj8xxdcsc4nnn2sm8ql

Re: Push TCM (CEP-21) and Accord (CEP-15) to 5.1 (and cut an immediate 5.1-alpha1)

2023-10-27 Thread Jacek Lewandowski
I've been thinking about this and I believe that if we ever decide to delay
a release to include some CEPs, we should make the plan and status of those
CEPs public. This should include publishing a branch, creating tickets for
the remaining work required for feature completion in Jira, and notifying
the mailing list.

By doing this, we can make an informed decision about whether delivering a
CEP in a release x.y planned for some time z is feasible. This approach
would also be beneficial for improving collaboration, as we will all be
aware of what is left to be done and can adjust our focus accordingly to
participate in the remaining work.

Thanks,
- - -- --- -  -
Jacek Lewandowski


On Fri, Oct 27, 2023 at 10:26, Benjamin Lerer wrote:

> I would be interested in testing Maxim's approach. We need more visibility
> on big features and their progress to improve our coordination. Hopefully
> it will also open the door to more collaboration on those big projects.
>
> On Thu, Oct 26, 2023 at 21:35, German Eichberger via dev <
> dev@cassandra.apache.org> wrote:
>
>> +1 to Maxim's idea
>>
>> Like Stefan my assumption was that we would get some version of TCM +
>> ACCORD in 5.0 but it wouldn't be ready for production use. My own testing
>> and conversations at Community over Code in Halifax confirmed this.
>>
>> From this perspective, as disappointing as TCM+ACCORD slipping is, moving
>> it to 5.1 makes sense and I am supportive of this - but I am worried that if 5.1
>> is basically 5.0 + TCM/ACCORD and this slips again, we paint ourselves into a
>> corner where we can't release 5.2 before 5.1 or something. I would like
>> some more elaboration on that.
>>
>> I am also very worried about ANN vector search being in jeopardy for 5.0
>> which is an important feature for me to win some internal company bet 🙂
>>
>> My 2 cents,
>> German
>>
>> --
>> *From:* Miklosovic, Stefan via dev 
>> *Sent:* Thursday, October 26, 2023 4:23 AM
>> *To:* dev@cassandra.apache.org 
>> *Cc:* Miklosovic, Stefan 
>> *Subject:* [EXTERNAL] Re: Push TCM (CEP-21) and Accord (CEP-15) to 5.1
>> (and cut an immediate 5.1-alpha1)
>>
>> What Maxim proposes in the last paragraph would be definitely helpful.
>> Not for the project only but for a broader audience, companies etc., too.
>>
>> Until this thread was started, my assumption was that "there will be 5.0
>> on summit with TCM and Accord and it somehow just happens". More
>> transparent communication where we are at with high-profile CEPs like these
>> and knowing if deadlines are going to be met would be welcome.
>>
>> I don't want to be that guy and don't take me wrong here, but really,
>> these CEPs are being developed, basically, by devs from two companies,
>> which have developers who do not have any real need to explain themselves
>> like what they do, regularly, to outsiders. (or maybe you do, you just
>> don't have time?) I get that. But on the other hand, you can not
>> realistically expect that other folks will have any visibility into what is
>> going on there and that there is a delay on the horizon and so on.
>>
>> 
>> From: Maxim Muzafarov 
>> Sent: Thursday, October 26, 2023 12:21
>> To: dev@cassandra.apache.org
>> Subject: Re: Push TCM (CEP-21) and Accord (CEP-15) to 5.1 (and cut an
>> immediate 5.1-alpha1)
>>
>> Personally, I think frequent releases (2-3 per year) are better than
>> infrequent big releases. I can understand all the concerns from a
>> marketing perspective, as smaller major releases may not shine as
>> brightly as a single "game changer" release. However, smaller
>> releases, especially if they don't have backwards compatibility
>> issues, are better for the engineering and SRE teams because if a
>> long-awaited feature is delayed for any reason, there should be no
>> worry about getting it in right into the next release.
>>
>> An analogy here might be that if you miss your train (small release)
>> due to circumstances, you can wait right here for the next one, but if
>> you miss a flight (big release), you will go back home :-) This is why
>> I think that the 5.0, 5.1, 5.2, etc. are better and I support Mick's
>> plan with the caveat that we should release 5.1 when we think we are
>> ready to do so. Here is an example of the Postgres releases [1].
>>
>> [1] https://bucardo.org/postgres_all_versions.html

Re: Push TCM (CEP-21) and Accord (CEP-15) to 5.1 (and cut an immediate 5.1-alpha1)

2023-10-27 Thread Alex Petrov
Firstly, when talking about quality, many folks mention the risk of releasing 
bugs together with TCM and Accord. While I agree this risk is real, I would 
like to remind everyone that TCM and Accord were extensively tested and 
simulated, for *many* hours. As just one example, we've recently filed an issue 
we found during our Harry testing, and this issue was introduced outside TCM. 
So a lot of validation work will be invalidated by having two releases. And I 
am not certain how many organisations have capacity for internally vetting two 
subsequent major releases. But I am also not completely opposed to having 5.0 
and 5.1 if this is something a majority of contributors prefers.

As regards developing CEPs in public, this is how it was already done this 
time. Both Accord and TCM were published for a substantial amount of time. I 
did read an argument that the code was somehow not ready for review, but by the 
same logic neither is it ready right now, as the interesting parts haven't 
changed in a while. Even the feedback on the CEP itself, which was published in 
April 2023, was minimal. There were multiple sessions about TCM and Accord 
in New Orleans in 2023, and the interested parties (including many folks from 
this discussion) couldn't help but learn about their status and progress. 
Still, there was very little engagement (which, I claim, is absolutely fine). 
So, since one can't say that we (collectively) are not publishing CEPs and code 
early enough, the only argument is that people choose to prioritise things 
based on what is important for their businesses today, and this is, again, 
completely fine.

If you are interested in a CEP, make sure you engage with its authors from the 
first time they publish something. There are many patches and CEPs I wish I 
had reviewed, but did not have time for. For those, I am reading the available 
discussions, talking to their authors, and writing Harry tests. I would not, 
however, ask someone to postpone a feature based on my past or future 
availability.

On Fri, Oct 27, 2023, at 10:14 AM, Jacek Lewandowski wrote:
> I've been thinking about this and I believe that if we ever decide to delay a 
> release to include some CEPs, we should make the plan and status of those 
> CEPs public. This should include publishing a branch, creating tickets for 
> the remaining work required for feature completion in Jira, and notifying the 
> mailing list.
> 
> By doing this, we can make an informed decision about whether delivering a 
> CEP in a release x.y planned for some time z is feasible. This approach would 
> also be beneficial for improving collaboration, as we will all be aware of 
> what is left to be done and can adjust our focus accordingly to participate 
> in the remaining work.
> 
> Thanks,
> - - -- --- -  -
> Jacek Lewandowski
> 
> 
> On Fri, Oct 27, 2023 at 10:26, Benjamin Lerer wrote:
>> I would be interested in testing Maxim's approach. We need more visibility 
>> on big features and their progress to improve our coordination. Hopefully it 
>> will also open the door to more collaboration on those big projects.
>> 
>> On Thu, Oct 26, 2023 at 21:35, German Eichberger via dev wrote:
>>> +1 to Maxim's idea
>>> 
>>> Like Stefan my assumption was that we would get some version of TCM + 
>>> ACCORD in 5.0 but it wouldn't be ready for production use. My own testing 
>>> and conversations at Community over Code in Halifax confirmed this.
>>> 
>>> From this perspective, as disappointing as TCM+ACCORD slipping is, moving it 
>>> to 5.1 makes sense and I am supportive of this - but I am worried that if 5.1 is 
>>> basically 5.0 + TCM/ACCORD and this slips again, we paint ourselves into a 
>>> corner where we can't release 5.2 before 5.1 or something. I would like 
>>> some more elaboration on that.
>>> 
>>> I am also very worried about ANN vector search being in jeopardy for 5.0 
>>> which is an important feature for me to win some internal company bet 🙂
>>> 
>>> My 2 cents,
>>> German
>>> 
>>> 
>>> *From:* Miklosovic, Stefan via dev 
>>> *Sent:* Thursday, October 26, 2023 4:23 AM
>>> *To:* dev@cassandra.apache.org 
>>> *Cc:* Miklosovic, Stefan 
>>> *Subject:* [EXTERNAL] Re: Push TCM (CEP-21) and Accord (CEP-15) to 5.1 (and 
>>> cut an immediate 5.1-alpha1)
>>>  
>>> What Maxim proposes in the last paragraph would be definitely helpful. Not 
>>> for the project only but for a broader audience, companies etc., too.
>>> 
>>> Until this thread was started, my assumption was that "there will be 5.0 on 
>>> summit with TCM and Accord and it somehow just happens". More transparent 
>>> communication where we are at with high-profile CEPs like these and knowing 
>>> if deadlines are going to be met would be welcome.
>>> 
>>> I don't want to be that guy and don't take me wrong here, but really, these 
>>> CEPs are being developed, basically, by devs from two companies, which have 
>>> developers who do not have any real need to explain themselves l

Project Status Update: 90-day catch-up edition [2023-10-27]

2023-10-27 Thread Josh McKenzie
In case you're keeping score on how frequently these are coming out: *please 
stop*. ;)

Silver lining - looks like we have a lot to discuss this round! Last update was 
late July and we've been churning through the 5.0 freeze and stabilization 
phase.


*[New Contributors Getting Started]*
Check out https://the-asf.slack.com, channel #cassandra-dev. Reply directly to 
me on this email if you need an invite for your account, and reach out to the 
@cassandra_mentors alias in the channel if you need to get oriented.

We have a list of curated "getting started" tickets you can find here, filtered 
to "ToDo" (i.e. not yet worked): 
https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=484&quickFilter=2160&quickFilter=2162&quickFilter=2652.

*Helpful links:*
- Getting Started with Development on C*: 
https://cassandra.apache.org/_/development/gettingstarted.html
- Building and IDE integration (worktrees are your friend; msg me on slack if 
you need pointers): https://cassandra.apache.org/_/development/ide.html
- Code Style: https://cassandra.apache.org/_/development/code_style.html


*[Dev mailing list]*
https://lists.apache.org/list?dev@cassandra.apache.org:dfr=2023-7-20%7Cdto=2023-10-27:

My last email of shame was 35 threads. Drumroll for this one...
91. *Yeesh*. Let me stick to highlights.

Ekaterina pushed through dropping JDK8 support and adding JDK17 support... back 
in July. If you didn't know about it by now, consider yourself doubly 
notified. :) https://lists.apache.org/thread/9pwz3vtpf88fly27psc7yxvcv0lwbz8k 
I think I can speak on behalf of all of us when I say: **Thank You Ekaterina.**

This came up recently on another thread about when to branch 5.1, but we 
discussed our freeze plans and exception rules for TCM and Accord here: 
https://lists.apache.org/thread/mzj3dq8b7mzf60k6mkby88b9n9ywmsgw. Mick was 
essentially looking for a similar waiver for Vector search since it was well 
abstracted, depended on SAI and external libs, and in general shouldn't be too 
big of a disruption to get into 5.0. General consensus at the time was "sure", 
and the work has since been completed. But here's the reminder and link for 
posterity (and in case you missed it).

Jaydeep reached out about a potential short-term solution to detecting 
token-ownership mismatch while we don't yet have TCM; this seems more pressing 
now as we're looking at a 5.0 without yet having TCM in it. The dev ML thread 
is here: https://lists.apache.org/thread/4p0orhom42g36osnknqj3fqmqhvqml1g, and 
he created https://issues.apache.org/jira/browse/CASSANDRA-18758 dealing with 
the topic. There's a relatively modest (7 files, just over 300 lines) PR 
available here: https://github.com/apache/cassandra/pull/2595/files; I haven't 
looked into it, but it might be worth considering getting this into 5.0 since 
it looks like we're moving to cutting w/out TCM. Any thoughts?

We had a pretty good discussion about automated repair scheduling, discussing 
whether it should live in the DB proper vs. in the sidecar, pros and cons, 
pressures, etc. Not sure if things moved beyond that; I know there's at least a 
few implementations out there that haven't yet made their way back to the ASF 
project proper. Thread: 
https://lists.apache.org/thread/glvmkwknf91rxc5l6w4d4m1kcvlr6mrv. My hope is we 
can avoid the gridlock we hit for a long time with the sidecar where there are 
multiple implementations with different tradeoffs and everyone's 
disincentivized from accepting a solution different from their own in-house one 
since it'd theoretically require re-tooling. Tough problem with no easy 
solutions, but would love to see this become a first class citizen in the 
ecosystem.

Paulo brought up a discussion about moving to disk_access_mode = 
mmap_index_only on 5.0. Seemed to be a consensus there but I'm not sure we 
actually changed that in the 5.0 branch? Thread: 
https://lists.apache.org/thread/nhp6vftc4kc3dxskngxy5rpo1lp19drw. Just pulled 
on cassandra-5.0 and it looks like auto + hasLargeAddressSpace() == .mmap 
rather than .mmap_index_only.
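
To make the difference concrete, here is a minimal sketch of how an `auto` setting might resolve under the current behaviour versus the proposed one. It is not the code on the cassandra-5.0 branch: the class `DiskAccessModeSketch`, the `Mode` enum, the `autoDefault` parameter, the 64-bit check, and the fallback to `standard` on a small address space are all assumptions made for illustration.

```java
// Hypothetical sketch: class, enum, and method names are illustrative and are
// not copied from the Cassandra code base.
final class DiskAccessModeSketch {
    enum Mode { auto, mmap, mmap_index_only, standard }

    // Assumption: "large address space" roughly means a 64-bit JVM.
    static boolean hasLargeAddressSpace(String dataModel) {
        return dataModel != null && dataModel.contains("64");
    }

    // Resolve the configured mode. 'autoDefault' is what 'auto' turns into
    // when the address space is large; otherwise we assume 'standard'.
    static Mode resolve(Mode configured, boolean largeAddressSpace, Mode autoDefault) {
        if (configured != Mode.auto)
            return configured; // an explicit setting always wins
        if (autoDefault == Mode.auto)
            throw new IllegalArgumentException("autoDefault must be a concrete mode");
        return largeAddressSpace ? autoDefault : Mode.standard;
    }

    // Under 'mmap' both data and index files are memory-mapped.
    static boolean mmapsDataFiles(Mode resolved) {
        return resolved == Mode.mmap;
    }

    // Under 'mmap_index_only' only index files are memory-mapped.
    static boolean mmapsIndexFiles(Mode resolved) {
        return resolved == Mode.mmap || resolved == Mode.mmap_index_only;
    }

    public static void main(String[] args) {
        boolean large = hasLargeAddressSpace(System.getProperty("sun.arch.data.model"));
        Mode current = resolve(Mode.auto, large, Mode.mmap);
        Mode proposed = resolve(Mode.auto, large, Mode.mmap_index_only);
        System.out.println("current:  " + current + ", data files mmapped: " + mmapsDataFiles(current));
        System.out.println("proposed: " + proposed + ", data files mmapped: " + mmapsDataFiles(proposed));
    }
}
```

In this sketch the proposal amounts to changing only the value passed as `autoDefault`; anyone who sets disk_access_mode explicitly would see no difference.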

David Capwell worked on adding some retries to repair messages when they're 
failing to make the process more robust: 
https://lists.apache.org/thread/wxv6k6slljqcw73xcmpxj4kn5lz95jd1. Reception was 
positive enough that he went so far as to back-port it and also work on some 
for IR. Looks like he could use a reviewer here: 
https://issues.apache.org/jira/browse/CASSANDRA-18962 - and this is patch 
available.

Mike Adamson reached out about adding / taking a dependency on jvector: 
https://lists.apache.org/thread/zkqg7mk9hp35zn0cf1tvywc2m3l63jrn. The general 
gist of it was "looks good, written by committer(s) / PMC members, permissively 
licensed. Go for it". Some discussion about copyright holders and whether that 
matters from an ASF perspective, and we've further had some good discussion 
about the application of generative AI tooling to not just code contributed to 
the ASF, but also in dep

Re: Push TCM (CEP-21) and Accord (CEP-15) to 5.1 (and cut an immediate 5.1-alpha1)

2023-10-27 Thread Josh McKenzie
Lots of threads of thought have popped up here. The big one I feel needs to be 
clearly addressed and inspected is the implication of development not happening 
transparently and not being inclusive or available for participation by the 
community on major features.

The CEP process + dedicated development channels on ASF Slack + public JIRAs + 
feature branches in the ASF repo we've seen with specifically TCM and Accord 
are the most transparent this kind of development has *ever been* on this 
project, and I'd argue we're right at the sweet spot, or past it, where 
reaching out to external parties for feedback stops merely hitting 
diminishing returns and starts to actively hurt a small group of people's 
ability to make rapid progress on something.

No-one can expect to review everything, and no-one can expect to follow every 
JIRA, commit, or update. This is why we have the role of a committer; a person 
in this community we've publicly communicated we trust based on earned merit 
(and in our project's case, at least 2 people whose opinion we trust) to do 
quality work, validate it, and reach our expected bar for correctness, 
performance, and maintainability. If a CEP is voted in and 2 committers have an 
implementation they feel meets the goals, CI is green, and nobody has a serious 
technical concern that warrants a binding -1, we're good. It doesn't, and 
shouldn't, matter who currently employs or sponsors their work. It doesn't, and 
shouldn't, matter whether individuals on the project who were interested in 
collaborating on that work missed one or multiple announcements, or whether 
they saw those announcements and just didn't have the cycles to engage when 
they wanted to.

Now - we can always improve. We can always try and be proactive, knowing each 
other and our interests and reaching out to specific folks to make sure they're 
aware that work has hit a collaboration point or inflection point. I can do 
(apparently much) better about sending out more consistent project status 
updates with calls to action around when these inflection points occur as well.

At the end of the day, this is an Apache project, and trust and lazy consensus 
are the backbone for how a lot of this stuff works when distributed and at 
scale.

On Fri, Oct 27, 2023, at 10:51 AM, Alex Petrov wrote:
> Firstly, when talking about quality, many folks mention the risk of releasing 
> bugs together with TCM and Accord. While I agree this risk is real, I would 
> like to remind that TCM and Accord were extensively tested and simulated, for 
> *many* hours. Just an example, we’ve recently filed an issue we found during 
> our Harry testing, and this issue was introduced outside TCM. So a lot of 
> validation work will be invalidated by having two releases. And I am not 
> certain how many organisations have capacity for internally vetting two 
> subsequent major releases. But I am also not completely opposed to having 5.0 
> and 5.1 if this is something a majority of contributors prefers.
> 
> As regards developing CEPs in public, this is how it was already done this 
> time. Both Accord and TCM were published for a substantial amount of time. I 
> did read an argument that the code was somehow not ready for review, but by 
> the same logic neither is it ready right now, as the interesting parts 
> haven't changed in a while. Even the feedback on the CEP itself, which was 
> published in April 2023, was minimal. There were multiple sessions about the 
> TCM and Accord in New Orleans in 2023, and the interested parties (including 
> many folks from this discussion) couldn't help but learn about their status 
> and progress. Still, there was very little engagement (which, I claim, is 
> absolutely fine). So, since one can't say that we (collectively) are not 
> publishing CEPs and code early enough, the only argument is that the people 
> choose to prioritise things based on what is important for their businesses 
> today, and this is, again, completely fine.
> 
> If you are interested in a CEP, make sure you engage with its authors from 
> the first time they publish something. There are many patches and CEPs I wish 
> I had reviewed, but did not have time for. For those, I am reading the 
> available discussions, talking to their authors, and writing Harry tests. I 
> would not, however, ask someone to postpone a feature based on my past or 
> future availability.
> 
> On Fri, Oct 27, 2023, at 10:14 AM, Jacek Lewandowski wrote:
>> I've been thinking about this and I believe that if we ever decide to delay 
>> a release to include some CEPs, we should make the plan and status of those 
>> CEPs public. This should include publishing a branch, creating tickets for 
>> the remaining work required for feature completion in Jira, and notifying 
>> the mailing list.
>> 
>> By doing this, we can make an informed decision about whether delivering a 
>> CEP in a release x.y planned for some time z is feasible. This approach 
>

Re: Project Status Update: 90-day catch-up edition [2023-10-27]

2023-10-27 Thread Sam
Please can I have an invite to the Slack workspace on this email. I'd like
to take a look through some of the items for first time contributors :-)

Thanks!

On Fri, 27 Oct 2023 at 18:10, Josh McKenzie  wrote:

> In case you're keeping score on how frequently these are coming out: *please
> stop*. ;)
>
> Silver lining - looks like we have a lot to discuss this round! Last
> update was late July and we've been churning through the 5.0 freeze and
> stabilization phase.
>
>
>
> *[New Contributors Getting Started]*
> Check out https://the-asf.slack.com, channel #cassandra-dev. Reply
> directly to me on this email if you need an invite for your account, and
> reach out to the @cassandra_mentors alias in the channel if you need to get
> oriented.
>
> We have a list of curated "getting started" tickets you can find here,
> filtered to "ToDo" (i.e. not yet worked):
> https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=484&quickFilter=2160&quickFilter=2162&quickFilter=2652
> .
>
> *Helpful links:*
> - Getting Started with Development on C*:
> https://cassandra.apache.org/_/development/gettingstarted.html
> - Building and IDE integration (worktrees are your friend; msg me on slack
> if you need pointers): https://cassandra.apache.org/_/development/ide.html
> - Code Style: https://cassandra.apache.org/_/development/code_style.html
>
>
>
> *[Dev mailing list]*
>
> https://lists.apache.org/list?dev@cassandra.apache.org:dfr=2023-7-20%7Cdto=2023-10-27
> :
>
> My last email of shame was 35 threads. Drumroll for this one...
> 91. *Yeesh*. Let me stick to highlights.
>
> Ekaterina pushed through dropping JDK8 support and adding JDK17 support...
> back in July. If you didn't know about it by now, consider yourself doubly
> notified. :) .
> https://lists.apache.org/thread/9pwz3vtpf88fly27psc7yxvcv0lwbz8k I think
> I can speak on behalf of all of us when I say: *Thank You Ekaterina.*
>
> This came up recently on another thread about when to branch 5.1, but we
> discussed our freeze plans and exception rules for TCM and Accord here:
> https://lists.apache.org/thread/mzj3dq8b7mzf60k6mkby88b9n9ywmsgw. Mick
> was essentially looking for a similar waiver for Vector search since it was
> well abstracted, depended on SAI and external libs, and in general
> shouldn't be too big of a disruption to get into 5.0. General consensus at
> the time was "sure", and the work has since been completed. But here's the
> reminder and link for posterity (and in case you missed it).
>
> Jaydeep reached out about a potential short-term solution to detecting
> token-ownership mismatch while we don't yet have TCM; this seems more
> pressing now as we're looking at a 5.0 without yet having TCM in it. The
> dev ML thread is here:
> https://lists.apache.org/thread/4p0orhom42g36osnknqj3fqmqhvqml1g, and he
> created https://issues.apache.org/jira/browse/CASSANDRA-18758 dealing
> with the topic. There's a relatively modest (7 files, just over 300 lines)
> PR available here: https://github.com/apache/cassandra/pull/2595/files; I
> haven't looked into it, but it might be worth considering getting this into
> 5.0 since it looks like we're moving to cutting w/out TCM. Any thoughts?
>
> We had a pretty good discussion about automated repair scheduling,
> discussing whether it should live in the DB proper vs. in the sidecar, pros
> and cons, pressures, etc. Not sure if things moved beyond that; I know
> there's at least a few implementations out there that haven't yet made
> their way back to the ASF project proper. Thread:
> https://lists.apache.org/thread/glvmkwknf91rxc5l6w4d4m1kcvlr6mrv. My hope
> is we can avoid the gridlock we hit for a long time with the sidecar where
> there are multiple implementations with different tradeoffs and everyone's
> disincentivized from accepting a solution different from their own in-house
> one since it'd theoretically require re-tooling. Tough problem with no easy
> solutions, but would love to see this become a first class citizen in the
> ecosystem.
>
> Paulo brought up a discussion about moving to disk_access_mode =
> mmap_index_only on 5.0. Seemed to be a consensus there but I'm not sure we
> actually changed that in the 5.0 branch? Thread:
> https://lists.apache.org/thread/nhp6vftc4kc3dxskngxy5rpo1lp19drw. Just
> pulled on cassandra-5.0 and it looks like auto + hasLargeAddressSpace() ==
> .mmap rather than .mmap_index_only.
>
> David Capwell worked on adding some retries to repair messages when
> they're failing to make the process more robust:
> https://lists.apache.org/thread/wxv6k6slljqcw73xcmpxj4kn5lz95jd1.
> Reception was positive enough that he went so far as to back-port it and
> also work on some for IR. Looks like he could use a reviewer here:
> https://issues.apache.org/jira/browse/CASSANDRA-18962 - and this is patch
> available.
>
> Mike Adamson reached out about adding / taking a dependency on jvector:
> https://lists.apache.org/thread/zkqg7mk9hp35zn0cf1tvywc2m3l63jrn. The
> general gist of it wa

Re: Project Status Update: 90-day catch-up edition [2023-10-27]

2023-10-27 Thread Patrick McFadin
Sent you an invite Sam. Welcome to the community!

On Fri, Oct 27, 2023 at 10:31 AM Sam  wrote:

> Please can I have an invite to the Slack workspace on this email. I'd like
> to take a look through some of the items for first time contributors :-)
>
> Thanks!
>
> On Fri, 27 Oct 2023 at 18:10, Josh McKenzie  wrote:
>
>> In case you're keeping score on how frequently these are coming out: *please
>> stop*. ;)
>>
>> Silver lining - looks like we have a lot to discuss this round! Last
>> update was late July and we've been churning through the 5.0 freeze and
>> stabilization phase.
>>
>>
>>
>> *[New Contributors Getting Started]*
>> Check out https://the-asf.slack.com, channel #cassandra-dev. Reply
>> directly to me on this email if you need an invite for your account, and
>> reach out to the @cassandra_mentors alias in the channel if you need to get
>> oriented.
>>
>> We have a list of curated "getting started" tickets you can find here,
>> filtered to "ToDo" (i.e. not yet worked):
>> https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=484&quickFilter=2160&quickFilter=2162&quickFilter=2652
>> .
>>
>> *Helpful links:*
>> - Getting Started with Development on C*:
>> https://cassandra.apache.org/_/development/gettingstarted.html
>> - Building and IDE integration (worktrees are your friend; msg me on
>> slack if you need pointers):
>> https://cassandra.apache.org/_/development/ide.html
>> - Code Style: https://cassandra.apache.org/_/development/code_style.html
>>
>>
>>
>> *[Dev mailing list]*
>>
>> https://lists.apache.org/list?dev@cassandra.apache.org:dfr=2023-7-20%7Cdto=2023-10-27
>> :
>>
>> My last email of shame was 35 threads. Drumroll for this one...
>> 91. *Yeesh*. Let me stick to highlights.
>>
>> Ekaterina pushed through dropping JDK8 support and adding JDK17
>> support... back in July. If you didn't know about it by now, consider
>> yourself doubly notified. :)
>> https://lists.apache.org/thread/9pwz3vtpf88fly27psc7yxvcv0lwbz8k I think
>> I can speak on behalf of all of us when I say: *Thank You Ekaterina.*
>>
>> This came up recently on another thread about when to branch 5.1, but we
>> discussed our freeze plans and exception rules for TCM and Accord here:
>> https://lists.apache.org/thread/mzj3dq8b7mzf60k6mkby88b9n9ywmsgw. Mick
>> was essentially looking for a similar waiver for Vector search since it was
>> well abstracted, depended on SAI and external libs, and in general
>> shouldn't be too big of a disruption to get into 5.0. General consensus at
>> the time was "sure", and the work has since been completed. But here's the
>> reminder and link for posterity (and in case you missed it).
>>
>> Jaydeep reached out about a potential short-term solution to detecting
>> token-ownership mismatch while we don't yet have TCM; this seems more
>> pressing now as we're looking at a 5.0 without yet having TCM in it. The
>> dev ML thread is here:
>> https://lists.apache.org/thread/4p0orhom42g36osnknqj3fqmqhvqml1g, and he
>> created https://issues.apache.org/jira/browse/CASSANDRA-18758 dealing
>> with the topic. There's a relatively modest (7 files, just over 300 lines)
>> PR available here: https://github.com/apache/cassandra/pull/2595/files;
>> I haven't looked into it, but it might be worth considering getting this
>> into 5.0 since it looks like we're moving to cutting w/out TCM. Any
>> thoughts?
>>
>> We had a pretty good discussion about automated repair scheduling,
>> discussing whether it should live in the DB proper vs. in the sidecar, pros
>> and cons, pressures, etc. Not sure if things moved beyond that; I know
>> there's at least a few implementations out there that haven't yet made
>> their way back to the ASF project proper. Thread:
>> https://lists.apache.org/thread/glvmkwknf91rxc5l6w4d4m1kcvlr6mrv. My
>> hope is we can avoid the gridlock we hit for a long time with the sidecar
>> where there are multiple implementations with different tradeoffs and
>> everyone's disincentivized from accepting a solution different from their
>> own in-house one since it'd theoretically require re-tooling. Tough problem
>> with no easy solutions, but would love to see this become a first class
>> citizen in the ecosystem.
>>
>> Paulo brought up a discussion about moving to disk_access_mode =
>> mmap_index_only on 5.0. Seemed to be a consensus there but I'm not sure we
>> actually changed that in the 5.0 branch? Thread:
>> https://lists.apache.org/thread/nhp6vftc4kc3dxskngxy5rpo1lp19drw. Just
>> pulled on cassandra-5.0 and it looks like auto + hasLargeAddressSpace() ==
>> .mmap rather than .mmap_index_only.
>>
>> David Capwell worked on adding some retries to repair messages when
>> they're failing to make the process more robust:
>> https://lists.apache.org/thread/wxv6k6slljqcw73xcmpxj4kn5lz95jd1.
>> Reception was positive enough that he went so far as to back-port it and
>> also work on some for IR. Looks like he could use a reviewer here:
>> https://issues.apache.org/jira/browse/CASSANDRA-18962 -

Re: Push TCM (CEP-21) and Accord (CEP-15) to 5.1 (and cut an immediate 5.1-alpha1)

2023-10-27 Thread German Eichberger via dev
Definitely want to second Josh. When I reached out on the ACCORD channel about
testing, folks were super helpful and transparent about bugs, etc.

Frankly, I was pretty frustrated that ACCORD+TCM slipped. I was looking forward 
to it and felt let down - but I also haven't done anything to help other than 
trying it out. So, I only have myself to blame...

That the slip came as a surprise to many of us is an indication there
wasn't enough communication - we should probably rethink how we communicate
progress, especially on long running and highly anticipated initiatives. Maybe
a paragraph in the "Project Status Update" (but then we need more frequent
updates 🙂) -- or a separate update e-mail, possibly sent, as Maxim is
suggesting, to some newly created release list.

A highly anticipated feature has more visibility and we need to account for
that with more communication beyond the usual channels. ACCORD in
particular was hyped in numerous talks and presentations and no one cautioned it
might not hit 5.0, quite the opposite -- so we need to ask ourselves how people
who go on stage as Cassandra experts are not aware that it could slip. That's
where I think more communication could help.


Thanks,
German





From: Josh McKenzie 
Sent: Friday, October 27, 2023 10:13 AM
To: dev 
Subject: [EXTERNAL] Re: Push TCM (CEP-21) and Accord (CEP-15) to 5.1 (and cut 
an immediate 5.1-alpha1)

Lots of threads of thought have popped up here. The big one I feel needs to be 
clearly addressed and inspected is the implication of development not happening 
transparently and not being inclusive or available for participation by the 
community on major features.

The CEP process + dedicated development channels on ASF slack + public JIRAs +
feature branches in the ASF repo we've seen with specifically TCM and Accord
are the most transparent this kind of development has ever been on this
project, and I'd argue right at the sweet spot or past where the degree of
reaching out to external parties to get feedback starts to not just hit
diminishing returns but starts to actively hurt a small group of people's
ability to make rapid progress on something.

No-one can expect to review everything, and no-one can expect to follow every 
JIRA, commit, or update. This is why we have the role of a committer; a person 
in this community we've publicly communicated we trust based on earned merit 
(and in our project's case, at least 2 people whose opinion we trust) to do
quality work, validate it, and reach our expected bar for correctness, 
performance, and maintainability. If a CEP is voted in and 2 committers have an 
implementation they feel meets the goals, CI is green, and nobody has a serious 
technical concern that warrants a binding -1, we're good. It doesn't, and 
shouldn't, matter who currently employs or sponsors their work. It doesn't, and 
shouldn't, matter whether individuals on the project who were interested in 
collaborating on that work missed one or multiple announcements, or whether 
they saw those announcements and just didn't have the cycles to engage when 
they wanted to.

Now - we can always improve. We can always try and be proactive, knowing each 
other and our interests and reaching out to specific folks to make sure they're 
aware that work has hit a collaboration point or inflection point. I can do 
(apparently much) better about sending out more consistent project status 
updates with calls to action around when these inflection points occur as well.

At the end of the day, this is an Apache project, and trust and lazy consensus 
are the backbone for how a lot of this stuff works when distributed and at 
scale.

On Fri, Oct 27, 2023, at 10:51 AM, Alex Petrov wrote:
Firstly, when talking about quality, many folks mention the risk of releasing 
bugs together with TCM and Accord. While I agree this risk is real, I would 
like to remind everyone that TCM and Accord were extensively tested and simulated, for
many hours. Just as an example, we’ve recently filed an issue we found during our
Harry testing, and this issue was introduced outside TCM. So a lot of 
validation work will be invalidated by having two releases. And I am not 
certain how many organisations have capacity for internally vetting two 
subsequent major releases. But I am also not completely opposed to having 5.0
and 5.1 if this is something a majority of contributors prefers.

As regards developing CEPs in public, this is how it was already done this 
time. Both Accord and TCM were published for a substantial amount of time. I 
did read an argument that the code was somehow not ready for review, but by the 
same logic neither is it ready right now, as the interesting parts haven't 
changed in a while. Even the feedback on the CEP itself, which was published in 
April 2023, was minimal. There were multiple sessions about the TCM and Accord 
in New Orleans in 2023, and the interested parties (including many fo