Re: [VOTE] CEP-21 Transactional Cluster Metadata

2023-02-08 Thread Yifan Cai
+1

From: Jon Haddad 
Sent: Tuesday, February 7, 2023 4:55:51 PM
To: dev@cassandra.apache.org 
Subject: Re: [VOTE] CEP-21 Transactional Cluster Metadata

+1

On 2023/02/06 16:15:19 Sam Tunnicliffe wrote:
> Hi everyone,
>
> I would like to start a vote on this CEP.
>
> Proposal:
> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-21%3A+Transactional+Cluster+Metadata
>
> Discussion:
> https://lists.apache.org/thread/h25skwkbdztz9hj2pxtgh39rnjfzckk7
>
> The vote will be open for 72 hours.
> A vote passes if there are at least three binding +1s and no binding vetoes.
>
> Thanks,
> Sam


Re: Welcome our next PMC Chair Josh McKenzie

2023-03-23 Thread Yifan Cai
Congratulations Josh!

From: Melissa Logan 
Sent: Thursday, March 23, 2023 8:04:01 AM
To: dev 
Subject: Re: Welcome our next PMC Chair Josh McKenzie

Josh, congratulations! Mick, thank you for all your efforts and support!

On Thu, Mar 23, 2023, 07:58 Joseph Lynch wrote:
Congratulations Josh! Thank you Mick!

-Joey

On Thu, Mar 23, 2023 at 10:56 AM Molly Monroy wrote:
Congrats Josh - looking forward to working with you more closely! It's been a 
pleasure, Mick!

On Thu, Mar 23, 2023 at 8:32 AM Josh McKenzie wrote:
Definitely want to +1 the appreciation for all the work Mick's put into the 
role.

Looking forward to continuing to help out where I can!

On Thu, Mar 23, 2023, at 9:27 AM, J. D. Jordan wrote:

Congrats Josh!

And thanks Mick for your time spent as Chair!

On Mar 23, 2023, at 8:21 AM, Aaron Ploetz wrote:

Congratulations, Josh!

And of course, thank you Mick for all you've done for the project while in the 
PMC Chair role!

On Thu, Mar 23, 2023 at 7:44 AM Derek Chen-Becker wrote:
Congratulations, Josh!

On Thu, Mar 23, 2023, 4:23 AM Mick Semb Wever wrote:
It is time to pass the baton on, and on behalf of the Apache Cassandra Project 
Management Committee (PMC) I would like to welcome and congratulate our next 
PMC Chair Josh McKenzie (jmckenzie).

Most of you already know Josh, especially through his regular and valuable 
project oversight and status emails, always presenting a balance and 
understanding to the various views and concerns incoming.

Repeating Paulo's words from last year: The chair is an administrative position 
that interfaces with the Apache Software Foundation Board, by submitting 
regular reports about project status and health. Read more about the PMC chair 
role on Apache projects:
- https://www.apache.org/foundation/how-it-works.html#pmc
- https://www.apache.org/foundation/how-it-works.html#pmc-chair
- https://www.apache.org/foundation/faq.html#why-are-PMC-chairs-officers

The PMC as a whole is the entity that oversees and leads the project and any 
PMC member can be approached as a representative of the committee. A list of 
Apache Cassandra PMC members can be found on: 
https://cassandra.apache.org/_/community.html



Re: [DISCUSS] CEP-28: Reading and Writing Cassandra Data with Spark Bulk Analytics

2023-03-24 Thread Yifan Cai
Hi Jeremiah,

There are good reasons to not have these inside Cassandra. Consider the
following.
- Resource isolation. Having the said service running within the same JVM
may negatively impact Cassandra's storage performance. It could be more
beneficial to have them in Sidecar, which offers strong resource isolation
guarantees.
- Availability. If the Cassandra cluster is being bounced, using Sidecar
would not affect the SBR/SBW functionality, e.g. SBR can still read
SSTables via Sidecar endpoints.
- Compatibility. Sidecar provides stable REST-based APIs, such as the
SSTable upload endpoint, which would remain compatible with different
versions of Cassandra. The current implementation supports versions 3.0 and 4.0.
- Complexity. Considering the existence of the Sidecar project, it would be
less complex to avoid adding another (http?) service in Cassandra.
- Release velocity. Sidecar, as an independent project, can have a quicker
release cycle than Cassandra.
- The features in sidecar are mostly implemented based on various existing
tools/APIs exposed from Cassandra, e.g. ring, commit sstable, snapshot, etc.

Regarding authentication and authorization
- We will add it as a follow-on CEP in Sidecar, but we don't want to hold
up this CEP. It would be a feature that benefits all Sidecar endpoints.

- Yifan

On Fri, Mar 24, 2023 at 2:43 PM Doug Rohrer  wrote:

> I agree that the analytics library will need to support vnodes. To be
> clear, there’s nothing preventing the solution from working with vnodes
> right now, and no assumptions about a 1:1 topology between a token and a
> node. However, we don’t, today, have the ability to test vnode support
> end-to-end. We are working towards that, however, and should be able to
> remove the caveat from the released analytics library once we can properly
> test vnode support.
> If it helps, I can update the CEP to say something more like “Caveat:
> Currently untested with vnodes - work is ongoing to remove this limitation”.
>
> Doug
>
> > On Mar 24, 2023, at 11:43 AM, Brandon Williams  wrote:
> >
> > On Fri, Mar 24, 2023 at 10:39 AM Jeremiah D Jordan
> >  wrote:
> >>
> >> I have concerns with the majority of this being in the sidecar and not
> >> in the database itself.  I think it would make sense for the server side of
> >> this to be a new service exposed by the database, not in the sidecar.  That
> >> way it can properly integrate with the authentication and
> >> authorization apis, and be a first class citizen in terms of having
> >> unit/integration tests in the main DB ensuring no one breaks it.
> >
> > I don't think this can/should happen until it supports the database's
> > default configuration with vnodes.
>
>


Re: [DISCUSS] CEP-28: Reading and Writing Cassandra Data with Spark Bulk Analytics

2023-03-28 Thread Yifan Cai
A lot of great discussions!

On the sidecar front, especially the role Sidecar plays in this CEP, I
feel there might be some confusion. Once the code is published, we should
have clarity.
Sidecar does not read sstables nor do any coordination for analytics
queries. It is local to the companion Cassandra instance. For bulk read, it
takes snapshots and streams sstables to spark workers to read. For bulk
write, it imports the sstables uploaded from spark workers. All commands
are existing jmx/nodetool functionalities from Cassandra. Sidecar adds the
http interface to them. This may be an oversimplified description. The
complex computation is performed in spark clusters only.

In the long run, Cassandra might evolve into a database that does both OLTP
and OLAP. (Not what this thread aims for)
At the current stage, Spark is well suited for analytic purposes.

On Tue, Mar 28, 2023 at 9:06 AM Benedict  wrote:

> I disagree with the first claim, as the process has all the information it
> chooses to utilise about which resources it’s using and what it’s using
> those resources for.
>
> The inability to isolate GC domains is something we cannot address, but
> also probably not a problem if we were doing everything with memory
> management as well as we could be.
>
> But, not worth derailing this thread for. Today we do very little well on
> this front within the process, and a separate process is well justified
> given the state of play.
>
> On 28 Mar 2023, at 16:38, Derek Chen-Becker  wrote:
>
> 
>
> On Tue, Mar 28, 2023 at 9:03 AM Joseph Lynch 
> wrote:
> ...
>
> I think we might be underselling how valuable JVM isolation is,
>> especially for analytics queries that are going to pass the entire
>> dataset through heap somewhat constantly.
>>
>
> Big +1 here. The JVM simply does not have significant granularity of
> control for resource utilization, but this is explicitly a feature of
> separate processes. Add in being able to separate GC domains and you can
> avoid a lot of noisy neighbor in-VM behavior for the disparate workloads.
>
> Cheers,
>
> Derek
>
>
> --
> +---+
> | Derek Chen-Becker |
> | GPG Key available at https://keybase.io/dchenbecker and   |
> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
> +---+
>
>


Re: [VOTE] CEP-28: Reading and Writing Cassandra Data with Spark Bulk Analytics

2023-05-04 Thread Yifan Cai
+1

From: Jon Haddad 
Sent: Thursday, May 4, 2023 3:31:52 PM
To: dev@cassandra.apache.org 
Subject: Re: [VOTE] CEP-28: Reading and Writing Cassandra Data with Spark Bulk 
Analytics

+1.

Awesome work Doug!  Great to see this moving forward.

On 2023/05/04 18:34:46 "C. Scott Andreas" wrote:
> +1nb.
>
> As someone familiar with this work, it's pretty hard to overstate the
> impact it has on completing Cassandra's HTAP story. Eliminating the overhead
> of bulk reads and writes on production OLTP clusters is transformative.
>
> - Scott
>
> On May 4, 2023, at 9:47 AM, Doug Rohrer  wrote:
>
> Hello all,
>
> I’d like to put CEP-28 to a vote.
>
> Proposal:
> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-28%3A+Reading+and+Writing+Cassandra+Data+with+Spark+Bulk+Analytics
>
> Jira:
> https://issues.apache.org/jira/browse/CASSANDRA-16222
>
> Draft implementation:
> - Apache Cassandra Spark Analytics source code:
>   https://github.com/frankgh/cassandra-analytics
> - Changes required for Sidecar:
>   https://github.com/frankgh/cassandra-sidecar/tree/CEP-28-bulk-apis
>
> Discussion:
> https://lists.apache.org/thread/lrww4d7cdxgtg8o3gt8b8foymzpvq7z3
>
> The vote will be open for 72 hours. A vote passes if there are at least three
> binding +1s and no binding vetoes.
>
> Thanks,
> Doug Rohrer


Re: [VOTE] Release dtest-api 0.0.14

2023-05-15 Thread Yifan Cai
+1

On Mon, May 15, 2023 at 3:13 PM Dinesh Joshi  wrote:

> Proposing the test build of in-jvm dtest API 0.0.14 for release.
>
> Repository:
> https://gitbox.apache.org/repos/asf?p=cassandra-in-jvm-dtest-api.git
>
> Candidate SHA:
>
> https://github.com/apache/cassandra-in-jvm-dtest-api/commit/ea4b44e0ed0a4f0bbe9b18fb40ad927b49a73a32
> tagged with 0.0.14
>
> Artifacts:
>
> https://repository.apache.org/content/repositories/orgapachecassandra-1289/org/apache/cassandra/dtest-api/0.0.14/
>
> Key signature: 53371F9B1B425A336988B6A03B6042413D323470
>
> Changes since last release:
>
> * CASSANDRA-18511: Add support for JMX in jvm-dtest
>
> The vote will be open for 24 hours. Everyone who has tested the build
> is invited to vote. Votes by PMC members are considered binding. A
> vote passes if there are at least three binding +1s.
>


Re: [VOTE] CEP-30 ANN Vector Search

2023-05-25 Thread Yifan Cai
+1

From: Josh McKenzie 
Sent: Thursday, May 25, 2023 5:37:02 PM
To: dev 
Subject: Re: [VOTE] CEP-30 ANN Vector Search

+1

On Thu, May 25, 2023, at 8:33 PM, Jake Luciani wrote:
+1

On Thu, May 25, 2023 at 11:45 AM Jonathan Ellis wrote:
Let's make this official.

CEP: 
https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-30%3A+Approximate+Nearest+Neighbor%28ANN%29+Vector+Search+via+Storage-Attached+Indexes

POC that demonstrates all the big rocks, including distributed queries: 
https://github.com/datastax/cassandra/tree/cep-vsearch

--
Jonathan Ellis
co-founder, http://www.datastax.com
@spyced
--
http://twitter.com/tjake


Re: [VOTE] CEP-8 Datastax Drivers Donation

2023-06-13 Thread Yifan Cai
+1

From: David Capwell 
Sent: Tuesday, June 13, 2023 8:37:10 AM
To: dev 
Subject: Re: [VOTE] CEP-8 Datastax Drivers Donation

+1

On Jun 13, 2023, at 7:59 AM, Josh McKenzie  wrote:

+1

On Tue, Jun 13, 2023, at 10:55 AM, Jeremiah Jordan wrote:
+1 nb

On Jun 13, 2023 at 9:14:35 AM, Jeremy Hanna wrote:

Calling for a vote on CEP-8 [1].

To clarify the intent, as Benjamin said in the discussion thread [2], the goal 
of this vote is simply to ensure that the community is in favor of the 
donation. Nothing more.
The plan is to introduce the drivers, one by one. Each driver donation will 
need to be accepted first by the PMC members, as it is the case for any 
donation. Therefore the PMC should have full control on the pace at which new 
drivers are accepted.

If this vote passes, we can start this process for the Java driver under the 
direction of the PMC.

Jeremy

1. 
https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-8%3A+Datastax+Drivers+Donation
2. https://lists.apache.org/thread/opt630do09phh7hlt28odztxdv6g58dp



Re: [VOTE] CEP 33 - CIDR filtering authorizer

2023-06-27 Thread Yifan Cai
+1

On Tue, Jun 27, 2023 at 1:50 PM Dinesh Joshi  wrote:

> +1
>
>
> On Jun 27, 2023, at 1:23 PM, Josh McKenzie  wrote:
>
> 
> +1
>
> On Tue, Jun 27, 2023, at 1:17 PM, Shailaja Koppu wrote:
>
> Hi Team,
>
> (Starting a new thread for VOTE instead of reusing the DISCUSS thread, to
> follow usual procedure).
>
> Please vote on CEP 33 - CIDR filtering authorizer
>
> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-33%3A+CIDR+filtering+authorizer
> 
> .
>
> Thanks,
> Shailaja
>
>
>


Re: Cassandra Sidecar CI is now green!

2023-07-20 Thread Yifan Cai
Thank you for fixing the build on ci-cassandra! I am glad that I can
contribute to the process :D

- Yifan

On Thu, Jul 20, 2023 at 4:00 PM Francisco Guerrero 
wrote:

> Hi list,
>
> I wanted to bring some visibility into the Cassandra Sidecar CI health [1].
> It seems like it has been broken for quite a while and we have finally
> fixed it today.
>
> Special thanks to Mick for noticing the issue and bringing it up to me.
> Also, thanks to Yifan and Dinesh for reviewing the PR [2] and helping me
> iterate over the PR.
>
> Best,
> - Francisco
>
> [1] https://ci-cassandra.apache.org/job/cassandra~sidecar/
> [2] https://issues.apache.org/jira/browse/CASSANDRASC-66
>


Re: [VOTE] CEP-34: mTLS based client and internode authenticators

2023-07-21 Thread Yifan Cai
+1

From: Dinesh Joshi 
Sent: Friday, July 21, 2023 12:23:30 PM
To: dev 
Subject: Re: [VOTE] CEP-34: mTLS based client and internode authenticators

+1

> On Jul 21, 2023, at 11:07 AM, Francisco Guerrero  wrote:
>
> +1 (nb). This is a very valuable enhancement for the project.
>
> Thanks for the contribution, Jyothsna!
>
> On 2023/07/21 16:57:45 Jyothsna Konisa wrote:
>> Hi Everyone!
>>
>> I would like to start a vote thread for CEP-34.
>>
>> Proposal:
>> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-34%3A+mTLS+based+client+and+internode+authenticators
>> JIRA   :
>> https://issues.apache.org/jira/browse/CASSANDRA-18554
>> Draft Implementation : https://github.com/apache/cassandra/pull/2372
>> Discussion :
>> https://lists.apache.org/thread/pnfg65r76rbbs70hwhsz94ds6yo2042f
>>
>> The vote will be open for 72 hours. A vote passes if there are at least 3
>> binding +1s and no binding vetoes.
>>
>> Thanks,
>> Jyothsna Konisa.
>>



Re: [VOTE] Release dtest-api 0.0.16

2023-08-19 Thread Yifan Cai
+1

From: C. Scott Andreas 
Sent: Saturday, August 19, 2023 9:51:16 AM
To: dev@cassandra.apache.org 
Subject: Re: [VOTE] Release dtest-api 0.0.16

+1nb

On Aug 19, 2023, at 9:50 AM, Blake Eggleston  wrote:

+1

On Aug 17, 2023, at 12:37 AM, Alex Petrov  wrote:


+1

On Thu, Aug 17, 2023, at 4:46 AM, Brandon Williams wrote:
+1

Kind Regards,
Brandon

On Wed, Aug 16, 2023 at 4:34 PM Dinesh Joshi wrote:
>
> Proposing the test build of in-jvm dtest API 0.0.16 for release.
>
> Repository:
> https://gitbox.apache.org/repos/asf?p=cassandra-in-jvm-dtest-api.git
>
> Candidate SHA:
> https://github.com/apache/cassandra-in-jvm-dtest-api/commit/1ba6ef93d0721741b5f6d6d72cba3da03fe78438
> tagged with 0.0.16
>
> Artifacts:
> https://repository.apache.org/content/repositories/orgapachecassandra-1307/org/apache/cassandra/dtest-api/0.0.16/
>
> Key signature: 53371F9B1B425A336988B6A03B6042413D323470
>
> Changes since last release:
>
> * CASSANDRA-18727 - JMXUtil.getJmxConnector should retry connection attempts
>
> The vote will be open for 24 hours. Everyone who has tested the build
> is invited to vote. Votes by PMC members are considered binding. A
> vote passes if there are at least three binding +1s.
>



Re: [VOTE] Accept java-driver

2023-10-03 Thread Yifan Cai
+1

From: David Capwell 
Sent: Tuesday, October 3, 2023 9:45:02 AM
To: dev 
Subject: Re: [VOTE] Accept java-driver

+1

On Oct 3, 2023, at 8:32 AM, Chris Lohfink  wrote:

+1

On Tue, Oct 3, 2023 at 10:30 AM Jeff Jirsa wrote:
+1


On Mon, Oct 2, 2023 at 9:53 PM Mick Semb Wever wrote:
The donation of the java-driver is ready for its IP Clearance vote.
https://incubator.apache.org/ip-clearance/cassandra-java-driver.html

The SGA has been sent to the ASF.  This does not require acknowledgement before 
the vote.

Once the vote passes, and the SGA has been filed by the ASF Secretary, we will 
request ASF Infra to move the datastax/java-driver as-is to apache/java-driver

This means all branches and tags, with all their history, will be kept.  A 
cleaning effort has already cleaned up anything deemed not needed.

Background for the donation is found in CEP-8: 
https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-8%3A+DataStax+Drivers+Donation

PMC members, please take note of (and check) the IP Clearance requirements when 
voting.

The vote will be open for 72 hours (or longer). Votes by PMC members are 
considered binding. A vote passes if there are at least three binding +1s and 
no -1's.

regards,
Mick



CASSANDRA-18941 produce size bounded SSTables from CQLSSTableWriter

2023-10-23 Thread Yifan Cai
Hi,

I want to propose merging the patch in CASSANDRA-18941 to 4.0 and up to
trunk and hope we are all OK with it.

In CASSANDRA-18941, I am adding the capability to produce size-bounded
SSTables in CQLSSTableWriter for sorted data. It can greatly benefit
Cassandra Analytics (https://github.com/apache/cassandra-analytics) for
bulk writing SSTables, since it avoids buffering and sorting on flush,
given the data source is sorted already in the bulk write process.
Cassandra Analytics supports Cassandra 4.0 and depends on the cassandra-all
4.0.x library. Therefore, we are mostly interested in using the new
capability in 4.0.
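
To illustrate the idea, here is a minimal Python sketch of size-bounded
output for pre-sorted input. It is a toy model, not the CQLSSTableWriter
API: the function name `write_size_bounded`, the `max_bytes` parameter, and
the in-memory lists standing in for SSTables are all assumptions made for
illustration, and rows are ordered by a plain string key rather than by
token.

```python
from typing import Iterable, List, Tuple

# (key, serialized value); a hypothetical stand-in for a real row.
Row = Tuple[str, bytes]


def write_size_bounded(sorted_rows: Iterable[Row], max_bytes: int) -> List[List[Row]]:
    """Split already-sorted rows into chunks of at most max_bytes each.

    Each chunk stands in for one SSTable. Because the input is sorted, rows
    are appended as they arrive and nothing is buffered for sorting.
    A single row larger than max_bytes still gets a chunk of its own.
    """
    chunks: List[List[Row]] = []
    current: List[Row] = []
    current_size = 0
    previous_key = None
    for key, value in sorted_rows:
        if previous_key is not None and key < previous_key:
            raise ValueError("input must be sorted by key")
        row_size = len(key.encode("utf-8")) + len(value)
        # Roll over to a new chunk before the size bound would be exceeded.
        if current and current_size + row_size > max_bytes:
            chunks.append(current)
            current, current_size = [], 0
        current.append((key, value))
        current_size += row_size
        previous_key = key
    if current:
        chunks.append(current)
    return chunks
```

The point of the sketch is that sorted input lets the writer decide where
to cut purely from a running size counter, without holding rows back to
sort them at flush time.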

CQLSSTableWriter is only used in offline tools and never in the code path
of Cassandra server.

Any objections to merging the patch to 4.0 and up to trunk?

- Yifan


Re: CASSANDRA-18941 produce size bounded SSTables from CQLSSTableWriter

2023-10-25 Thread Yifan Cai
Thanks everyone! I have updated CASSANDRA-18941 with PRs to each branch,
i.e. cassandra-4.0, cassandra-4.1, cassandra-5.0 and cassandra-trunk.

- Yifan

On Wed, Oct 25, 2023 at 7:00 AM Doug Rohrer  wrote:

> +1 (nb) - will be nice for the analytics writer to be able to size
> SSTables appropriately and efficiently.
>
> Doug
>
> On Oct 24, 2023, at 10:36 PM, guo Maxwell  wrote:
>
> 😄
>
> On Wed, Oct 25, 2023 at 05:02, Chris Lohfink wrote:
>
>> +1
>>
>> On Tue, Oct 24, 2023 at 11:24 AM Brandon Williams 
>> wrote:
>>
>>> +1
>>>
>>> Kind Regards,
>>> Brandon
>>>
>>> On Mon, Oct 23, 2023 at 6:22 PM Yifan Cai  wrote:
>>> >
>>> > Hi,
>>> >
>>> > I want to propose merging the patch in CASSANDRA-18941 to 4.0 and up
>>> > to trunk and hope we are all OK with it.
>>> >
>>> > In CASSANDRA-18941, I am adding the capability to produce size-bounded
>>> > SSTables in CQLSSTableWriter for sorted data. It can greatly benefit
>>> > Cassandra Analytics (https://github.com/apache/cassandra-analytics) for
>>> > bulk writing SSTables, since it avoids buffering and sorting on flush,
>>> > given the data source is sorted already in the bulk write process.
>>> > Cassandra Analytics supports Cassandra 4.0 and depends on the cassandra-all
>>> > 4.0.x library. Therefore, we are mostly interested in using the new
>>> > capability in 4.0.
>>> >
>>> > CQLSSTableWriter is only used in offline tools and never in the code
>>> > path of Cassandra server.
>>> >
>>> > Any objections to merging the patch to 4.0 and up to trunk?
>>> >
>>> > - Yifan
>>>
>>
>


Re: [DISCUSS] Harry in-tree

2023-11-27 Thread Yifan Cai
+1

From: Sam Tunnicliffe
Sent: Tuesday, November 28, 2023 2:43:51 AM
To: dev
Subject: Re: [DISCUSS] Harry in-tree

Definite +1 to bringing harry-core in tree.

On 24 Nov 2023, at 15:43, Alex Petrov  wrote:

Hi everyone,

With TCM landed, there will be way more Harry tests in-tree: we are using it 
for many coordination tests, and there's now a simulator test that uses Harry. 
During development, Harry has allowed us to uncover and resolve numerous 
elusive edge cases.

I had conversations with several folks, and wanted to propose to move 
harry-core to Cassandra test tree. This will substantially simplify/streamline 
co-development of Cassandra and Harry. With a new HistoryBuilder API that has 
helped to find and trigger [1] [2] and [3], it will also be much more 
approachable.

Besides making it easier for everyone to develop new fuzz tests, it will also 
substantially lower the barrier to entry. Currently, debugging an issue found 
by Harry involves a cumbersome process of rebuilding and transferring jars 
between Cassandra and Harry, depending on which side you modify. This not only 
hampers efficiency but also deters broader adoption. By merging harry-core into 
the Cassandra test tree, we eliminate this barrier.

Thank you,
--Alex

[1] https://issues.apache.org/jira/browse/CASSANDRA-19011
[2] https://issues.apache.org/jira/browse/CASSANDRA-18993
[3] https://issues.apache.org/jira/browse/CASSANDRA-18932



Re: Welcome Francisco Guerrero Hernandez as Cassandra Committer

2023-11-28 Thread Yifan Cai
Congratulations! It is well deserved.

From: C. Scott Andreas
Sent: Wednesday, November 29, 2023 2:56:34 AM
To: dev@cassandra.apache.org
Subject: Re: Welcome Francisco Guerrero Hernandez as Cassandra Committer

Congratulations, Francisco!

- Scott

> On Nov 28, 2023, at 10:53 AM, Dinesh Joshi  wrote:
>
> The PMC members are pleased to announce that Francisco Guerrero Hernandez 
> has accepted
> the invitation to become committer today.
>
> Congratulations and welcome!
>
> The Apache Cassandra PMC members


Re: Welcome Maxim Muzafarov as Cassandra Committer

2024-01-08 Thread Yifan Cai
Congrats!

From: David Capwell 
Sent: Monday, January 8, 2024 11:03:12 AM
To: dev 
Subject: Re: Welcome Maxim Muzafarov as Cassandra Committer

Congrats!

On Jan 8, 2024, at 10:53 AM, Jacek Lewandowski  
wrote:

Congratulations Maxim, well deserved, it's a pleasure to work with you!

- - -- --- -  -
Jacek Lewandowski


On Mon, Jan 8, 2024 at 19:35, Lorina Poland wrote:
Congratulations Maxim!

On 2024/01/08 18:19:04 Josh McKenzie wrote:
> The Apache Cassandra PMC is pleased to announce that Maxim Muzafarov has 
> accepted
> the invitation to become a committer.
>
> Thanks for all the hard work and collaboration on the project thus far, and 
> we're all looking forward to working more with you in the future. 
> Congratulations and welcome!
>
> The Apache Cassandra PMC members
>
>



Re: Welcome Alexandre Dutra, Andrew Tolbert, Bret McGuire, Olivier Michallat as Cassandra Committers

2024-04-17 Thread Yifan Cai
Congrats all

From: Josh McKenzie 
Sent: Wednesday, April 17, 2024 11:05:29 AM
To: dev 
Subject: Re: Welcome Alexandre Dutra, Andrew Tolbert, Bret McGuire, Olivier 
Michallat as Cassandra Committers

Congrats everyone and thanks for all the hard work to get things to this point!

On Wed, Apr 17, 2024, at 1:18 PM, Ekaterina Dimitrova wrote:
Congrats and thank you for all your work on the drivers!

On Wed, 17 Apr 2024 at 13:17, Francisco Guerrero wrote:
Congratulations everyone!

On 2024/04/17 17:14:34 Abe Ratnofsky wrote:
> Congrats everyone!
>
> > On Apr 17, 2024, at 1:10 PM, Benjamin Lerer wrote:
> >
> > The Apache Cassandra PMC is pleased to announce that Alexandre Dutra, 
> > Andrew Tolbert, Bret McGuire and Olivier Michallat have accepted the 
> > invitation to become committers on the java driver sub-project.
> >
> > Thanks for your contributions to the Java driver during all those years!
> > Congratulations and welcome!
> >
> > The Apache Cassandra PMC members
>
>



Re: [DISCUSS] CEP-42: Constraints Framework

2024-06-06 Thread Yifan Cai
Hi Stefan,

Thanks for putting together the FQL example! However, it seems to be
incorrect. FQL only records the _successful_ queries. The query at T4 fails,
and it will not be included in the FQL log.
I do agree that changing guardrails on the fly can cause confusion when FQL
is enabled on the node. Operators should probably avoid doing so. But it
seems unrelated to constraints. Besides, there are value size guardrails,
i.e. columnValueSize and collectionSize, available in Cassandra already.
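
The replay argument can be sketched with a toy model in Python. Everything
here is hypothetical: `ToyNode`, its fields, and the operation names are
invented for illustration and do not correspond to Cassandra or FQL
classes. The model simply assumes, as stated above, that only successful
operations reach the log and that a runtime guardrail is lost on restart.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class ToyNode:
    guardrail_max: Optional[int] = None   # set at runtime, not via a query
    constraint_max: Optional[int] = None  # table-level constraint
    fql: List[Tuple[str, int]] = field(default_factory=list)

    def execute(self, op: str, size: int, record: bool = True) -> bool:
        """Apply one operation and log it only if it succeeds."""
        if op == "set_constraint":
            ok = self.guardrail_max is None or size <= self.guardrail_max
            if ok:
                self.constraint_max = size
        elif op == "insert":
            ok = self.constraint_max is None or size <= self.constraint_max
        else:
            raise ValueError(f"unknown operation: {op}")
        if ok and record:
            self.fql.append((op, size))
        return ok


node = ToyNode()                    # T0: no guardrails set
node.guardrail_max = 10             # T1: guardrail set out of band
node.execute("set_constraint", 8)   # T2: succeeds, logged
node.execute("insert", 5)           # T3: succeeds, logged
node.execute("set_constraint", 15)  # T4: rejected by the guardrail, not logged

replayed = ToyNode()                # restarted node, guardrail is gone
for op, size in node.fql:
    replayed.execute(op, size, record=False)
```

Under these assumptions the replayed node ends with the same constraint as
the original one, because the failed T4 statement never entered the log.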

On extensibility, I agree that the CEP should make it clear what
constraints are included and how they work. My understanding is that it
aims to provide size checks and value checks, which are useful in most cases.
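
As a rough sketch of what a size check and a value check amount to, the
Python below validates a row against per-column predicates. It is not the
CEP's design: the helper names `max_size`, `less_than`, and `validate` are
assumptions for illustration, and the `number_of_items < 1000` rule is
borrowed from the CEP example quoted further down this thread.

```python
from typing import Any, Callable, Dict, List

Check = Callable[[Any], bool]


def max_size(limit: int) -> Check:
    """Size check: accept values whose length is at most limit."""
    return lambda value: len(value) <= limit


def less_than(bound: int) -> Check:
    """Value check: accept values strictly below bound."""
    return lambda value: value < bound


def validate(row: Dict[str, Any], constraints: Dict[str, List[Check]]) -> List[str]:
    """Return the columns in row that violate at least one constraint."""
    return [
        column
        for column, checks in constraints.items()
        if not all(check(row[column]) for check in checks)
    ]


constraints = {
    "number_of_items": [less_than(1000)],  # value check
    "name": [max_size(8)],                 # size check
}
violations = validate({"number_of_items": 1000, "name": "abc"}, constraints)
```

In this example `violations` names only `number_of_items`, since 1000 is
not strictly below the bound while the three-character name fits.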

- Yifan

On Thu, Jun 6, 2024 at 9:25 AM Štefan Miklošovič <
stefan.mikloso...@gmail.com> wrote:

> Another problem with this constraints feature is that if it does not
> solely rely on constraints in CQL, then it would be non-deterministic if we
> want to replay all mutations from an FQL log.
>
> Let's take this into consideration (T = time)
>
> T0 - a node is started with no guardrails set
> T1 - guardrail is set via JMX to not allow anything bigger than size of 10
> (whatever size means)
> T2 - a user creates a table with a constraint that anything bigger than
> size of 8 is forbidden
> T3 - a user inserts a mutation with size of 5
> T4 - a user modifies a table to set the constraint in such a way that
> anything bigger than size of 15 is forbidden - this will fail because we
> have a guardrail that anything bigger than 10 is forbidden from T1.
>
> Then we gather FQL log and restart the node, as guardrails do not survive
> restarts for now, when we replay, then T4 will be replayed too but it
> should not be.
>
> Is this correct?
>
> On Thu, Jun 6, 2024 at 9:49 AM Štefan Miklošovič <
> stefan.mikloso...@gmail.com> wrote:
>
>> I agree with Jon that a detailed description of all constraints to be
>> introduced is necessary. Only to say that it will be extensible so we can
>> add other constraints later is not enough. What other constraints?
>>
>> On Thu, Jun 6, 2024 at 6:24 AM Jon Haddad  wrote:
>>
>>> I think there's some promising ideas here, but the CEP needs to be
>>> developed a bit more.
>>>
>>> > Other types of constraints and functions can be added in the future
>>> to provide even more flexibility, but are out of the scope of this CEP.
>>>
>>> > For the third point, I didn’t want to be prescriptive on what those
>>> validations should be, but the fact that the proposal is extensible to
>>> those potential use cases is something concrete that, in my opinion, comes
>>> as a benefit of the actual proposal. I’d be happy to develop a bit more the
>>> main example used of sizeOf if it helps alleviate your concerns on this
>>> point.
>>>
>>> I disagree, quite strongly, with this.  While I appreciate
>>> extensibility, I think having a variety of actual constraints that ship
>>> with the feature means it needs to be built to satisfy real world use
>>> cases.  Without going through this process, it feels a bit too much like
>>> triggers, UDAs and UDFs  - incomplete, and too much left to the end user.
>>>
>>> To me, punting on thinking through constraints kicks the most important
>>> can down the road.
>>>
>>> Jon
>>>
>>>
>>> On Tue, Jun 4, 2024 at 5:37 PM Bernardo Botella <
>>> conta...@bernardobotella.com> wrote:
>>>
>>>> In the CEP document there is another example (although not explicitly
>>>> mentioned) adding a constraint to the max value of an int ->
>>>> `number_of_items int CONSTRAINT number_of_items < 1000`
>>>>
>>>> This basic example can also be used to expand on how to extend this
>>>> functionality with these two initial constraints (size and value), by
>>>> composing them to create new data types with proper validation.
>>>>
>>>> For example, this could create an ipv4 with built in validation:
>>>> CREATE TYPE keyspace.cidr_address_ipv4 (
>>>>   ip_adress inet,
>>>>   subnet_mask int,
>>>>   CONSTRAINT subnet_mask > 0,
>>>>   CONSTRAINT subnet_mask < 32
>>>> )
>>>>
>>>> Or a color type:
>>>> CREATE TYPE keyspace.color (
>>>>   r int,
>>>>   g int,
>>>>   b int,
>>>>   CONSTRAINT r >= 0,
>>>>   CONSTRAINT r < 255,
>>>>   CONSTRAINT g >= 0,
>>>>   CONSTRAINT g < 255,
>>>>   CONSTRAINT b >= 0,
>>>>   CONSTRAINT b < 255
>>>> )
>>>>
>>>>
>>>> Other types of constraints and functions can be added in the future
>>>> to provide even more flexibility, but are out of the scope of this CEP.
>>>>
>>>> Bernardo
>>>>
>>>> On Jun 4, 2024, at 1:01 PM, Jon Haddad  wrote:
>>>>
>>>> The idea is interesting.  I think it would help to have more concrete
>>>> examples.  It's a bit sparse at the moment, and I have a hard time getting
>>>> on board with new features where the main selling point is Extensibility
>>>> over the value they provide on their own.
>>>>
>>>> I think it would help a lot if we knew what types of constraints,
>>>> besides the size check, you were thinking of adding.
>>>>
>>>> Jon
>>>>
>>>> On Mon, Jun 3, 202

Re: Suggestions for CASSANDRA-18078

2024-06-20 Thread Yifan Cai
I am voting against this for now.

There is an unaddressed gap between the functions. I do not believe an
equivalent replacement for the MAXWRITETIME function exists yet, so removing
it now would disrupt its adopters.

MAXWRITETIME handles both single value columns and collections as input.
Meanwhile, COLLECTION_MAX(WRITETIME(..)) only applies to collections.
CASSANDRA-18085 aims to extend COLLECTION_MAX to non-collection values.
Once CASSANDRA-18085 is merged, we will have a true replacement and can
remove MAXWRITETIME.
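
The gap can be pictured with a small Python model. These are not the CQL
functions themselves: `max_writetime` and `collection_max` are toy
stand-ins that only mirror the input types described above, with write
times reduced to plain integers, and they say nothing about how the real
functions treat nulls or empty collections.

```python
from typing import List, Union

Writetime = int  # a write timestamp, modeled as a plain integer


def collection_max(values: List[Writetime]) -> Writetime:
    """Toy COLLECTION_MAX: defined for collections only."""
    if not isinstance(values, list):
        raise TypeError("collection_max expects a collection")
    return max(values)


def max_writetime(writetimes: Union[Writetime, List[Writetime]]) -> Writetime:
    """Toy MAXWRITETIME: accepts a single write time or a collection."""
    if isinstance(writetimes, list):
        return max(writetimes)
    return writetimes
```

In this model a query over a single-value column works with
`max_writetime` but fails with `collection_max`, which is the case that
would break for existing users if the former were removed first.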

- Yifan

On Thu, Jun 20, 2024 at 10:32 AM Jon Haddad  wrote:

> Agreed. If we release it, we can’t remove it after. Option 2 is off the
> table.
>
> —
> Jon Haddad
> Rustyrazorblade Consulting
> rustyrazorblade.com
>
>
> On Thu, Jun 20, 2024 at 7:13 PM Jeff Jirsa  wrote:
>
>> If we have a public-facing API that we’re contemplating releasing to the
>> public, and we don’t think it’s needed, we should remove it before it’s
>> launched and we’re stuck with it forever.
>>
>>
>>
>>
>> On Jun 20, 2024, at 9:55 AM, Jeremiah Jordan 
>> wrote:
>>
>> +1 from me for 1, just remove it now.
>> I think this case is different from CASSANDRA-19556/CASSANDRA-17425.  The
>> new guardrail from 19556 which would deprecate the 17425 has not been
>> committed yet.  In the case of MAXWRITETIME the replacement is already in
>> the code, we just didn’t remove MAXWRITETIME yet.
>>
>> Jeremiah Jordan
>> e. jerem...@datastax.com
>> w. www.datastax.com
>>
>>
>>
>> On Jun 20, 2024 at 11:46:08 AM, Štefan Miklošovič 
>> wrote:
>>
>>> List,
>>>
>>> we need your opinions about CASSANDRA-18078.
>>>
>>> That ticket is about the removal of MAXWRITETIME function which was
>>> added in CASSANDRA-17425 and first introduced in 5.0-alpha1.
>>>
>>> This function was identified to be redundant in favor of CASSANDRA-8877
>>> and CASSANDRA-18060.
>>>
>>> The idea of the removal was welcomed and the patch was prepared doing so
>>> but it was never delivered and the question what to do with it, in
>>> connection with 5.0.0, still remains.
>>>
>>> The options are:
>>>
>>> 1) since 18078 was never released in GA, there is still time to remove
>>> it.
>>> 2) it is too late for the removal hence we would keep it in 5.0.0 and we
>>> would deprecate it in 5.0.1 and remove it in trunk.
>>>
>>> It is worth noting that there is a precedent for 2) in CASSANDRA-17495,
>>> where it was the very same scenario. A guardrail was introduced in alpha1.
>>> We decided to release and deprecate in 5.0.1 and remove in trunk. The same
>>> might be applied here, however we would like to have it confirmed if this
>>> is indeed the case or we prefer to just go with 1) and be done with it.
>>>
>>> Regards
>>>
>>
>>


Re: Cassandra PMC Chair Rotation, 2024 Edition

2024-06-20 Thread Yifan Cai
Thank you for the service, Josh!
Congrats, Dinesh!

On Thu, Jun 20, 2024 at 11:32 AM Jean-Armel Luce  wrote:

> Josh, thanks for the job
> Dinesh, congrats!!
>
> Le jeu. 20 juin 2024 à 19:42, David Capwell  a écrit :
>
>> Congrats!
>>
>> On Jun 20, 2024, at 9:10 AM, Melissa Logan  wrote:
>>
>> Josh, thank you for your time as chair + congrats Dinesh!
>>
>> On Thu, Jun 20, 2024 at 9:08 AM Abe Ratnofsky  wrote:
>>
>>> Congrats Dinesh! Thank you Josh!
>>>
>>> On Jun 20, 2024, at 11:53 AM, Jeremiah Jordan 
>>> wrote:
>>>
>>> Welcome to the Chair role Dinesh!  Congrats!
>>>
>>> On Jun 20, 2024 at 10:50:37 AM, Josh McKenzie 
>>> wrote:
>>>
>>>> Another PMC Chair baton pass incoming! On behalf of the Apache
>>>> Cassandra Project Management Committee (PMC) I would like to welcome and
>>>> congratulate our next PMC Chair Dinesh Joshi (djoshi).
>>>>
>>>> Dinesh has been a member of the PMC for a few years now and many of you
>>>> likely know him from his thoughtful, measured presence on many of our
>>>> collective discussions as we've grown and evolved over the past few years.
>>>>
>>>> I appreciate the project trusting me as liaison with the board over the
>>>> past year and look forward to supporting Dinesh in the role in the future.
>>>>
>>>> Repeating Mick (repeating Paulo's) words from last year: The chair is
>>>> an administrative position that interfaces with the Apache Software
>>>> Foundation Board, by submitting regular reports about project status and
>>>> health. Read more about the PMC chair role on Apache projects:
>>>> - https://www.apache.org/foundation/how-it-works.html#pmc
>>>> - https://www.apache.org/foundation/how-it-works.html#pmc-chair
>>>> - https://www.apache.org/foundation/faq.html#why-are-PMC-chairs-officers
>>>>
>>>> The PMC as a whole is the entity that oversees and leads the project
>>>> and any PMC member can be approached as a representative of the committee.
>>>> A list of Apache Cassandra PMC members can be found on:
>>>> https://cassandra.apache.org/_/community.html

>>>
>>>
>>


Re: [DISCUSS] CEP-42: Constraints Framework

2024-06-25 Thread Yifan Cai
>
> - Alter and Drop constraints are as follows
> ALTER CONSTRAINT [name] CHECK new_condition
> DROP CONSTRAINT [name]
>

I think you mean the following syntax to modify existing constraints, since
constraints are part of the table definition.
ALTER TABLE [keyspace_name.]table_name ALTER CONSTRAINT [constraint_name]
CHECK check_expression

Dinesh's proposal to check on read is a good addition. I think it should be
*optional* and enabled/disabled via configuration. The extra check may not
be desirable in some circumstances, e.g. when a use case never changes its
constraints and has no write path other than CQL. Since the original CEP
defines the constraints as applied at write time, we need to update the CEP
if we decide to include the check on read.
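
A minimal sketch of a write-time check combined with an optional,
configurable check on read, along the lines discussed in this thread.
Everything in it is hypothetical and only for illustration (the ReadCheck
modes, the toy Table class, the predicate-style constraint); it is not the
CEP-42 API or Cassandra code, and Python is used only for brevity.

```python
import warnings
from enum import Enum


class ReadCheck(Enum):
    """Hypothetical config values for the optional check on read."""
    OFF = "off"      # constraints are enforced on the write path only
    WARN = "warn"    # return violating data, but emit a warning
    ERROR = "error"  # refuse to return violating data


class ConstraintViolation(Exception):
    pass


class Table:
    """Toy in-memory table with one CHECK-style constraint (a predicate)."""

    def __init__(self, check, read_check=ReadCheck.OFF):
        self.check = check
        self.read_check = read_check
        self.rows = {}

    def write(self, key, value):
        # Write path: the constraint is always enforced here.
        if not self.check(value):
            raise ConstraintViolation(f"write rejected for key {key!r}")
        self.rows[key] = value

    def load_unchecked(self, key, value):
        # Stands in for data that bypasses the CQL write path
        # (e.g. imported SSTables) or predates an altered constraint.
        self.rows[key] = value

    def read(self, key):
        value = self.rows[key]
        if self.read_check is ReadCheck.OFF or self.check(value):
            return value
        if self.read_check is ReadCheck.WARN:
            warnings.warn(f"row {key!r} violates the constraint")
            return value
        raise ConstraintViolation(f"read rejected for key {key!r}")
```

With ReadCheck.OFF the read path does no extra work, which is the case
where constraints never change and all writes go through CQL.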

- Yifan


On Tue, Jun 25, 2024 at 1:13 PM Štefan Miklošovič 
wrote:

> I wonder how often it is that users will apply the constraints on tables
> with data while they know their data is probably not compliant with the
> constraint configuration. I humbly think that people are aware of this in
> advance and what usually happens is that there is some kind of a job which
> consolidates the data (or migrates them to a new table) before admins put a
> "lid" on that so moving forward nobody puts there anything which would
> violate it.
>
> I probably have not kept myself up to date with the discussion but I was
> thinking that constraints are effectively there just on the write path.
> Whatever is read is not a job of a constraint to refuse to return.
>
> On Tue, Jun 25, 2024 at 9:57 PM Dinesh Joshi  wrote:
>
>> Abe, that's a good point. We need to call out distinct use-cases here.
>> When a fresh cluster is set up with constraints we don't have any issues
>> because the data written and read back is going to be compliant to the
>> constraint(s). For existing data in a cluster where new constraints are
>> applied or existing constraints changed in such a way that may render
>> existing data unreadable, we need a good user experience. This is what I
>> propose –
>>
>> 1. When a constraint is added or changed in such a way that existing data
>> could be rendered unreadable, we should warn the user.
>>
>> 2. Give the user a choice of whether it is ok for the data to be rendered
>> unreadable and an error is issued or a warning should be issued when the
>> read violates the constraint but data is still readable. New data going in
>> will meet the constraint but old data would need to be rewritten for
>> the application to make it compliant.
>>
>> With this approach the application developer can decide what is right for
>> their particular use-case. In many cases the application developer may
>> decide to rewrite the data when they see a warning.
>>
>>
>> On Tue, Jun 25, 2024 at 12:46 PM Abe Ratnofsky  wrote:
>>
>>> If we're going to introduce a feature that looks like SQL constraints,
>>> we should make sure it's "reasonably" compliant. In particular, we should
>>> avoid situations where a user creates a constraint, writes some data, then
>>> reads data that violates that constraint, unless they've expressed that
>>> violations on read would be acceptable.
>>>
>>> For Postgres, when adding a new constraint you can specify NOT VALID to
>>> avoid scanning all existing relevant data[1]. If we want to avoid
>>> scan-on-DDL, this tradeoff needs to be made clear to a user.
>>>
>>> As we've already discussed, constraints must deal with operations that
>>> appear within limits on the write path, but once reconciled on read or
>>> during compaction can lead to a violation. Adding to non-frozen collections
>>> is one example. Expecting users to understand the write path for
>>> collections feels unrealistic to me; I wonder if we should express in the
>>> constraint itself that it only applies during write.
>>>
>>> Anything that uses "nodetool import" (including cassandra-analytics)
>>> could theoretically push constraint-violating mutations to a table. We
>>> could update import to scan table contents first, or add a flag to trust
>>> the data in imported SSTables and make cassandra-analytics executors aware
>>> of table-level constraints.
>>>
>>> Some client implementations read the system_schema tables to build their
>>> object mappers, I'd like to confirm that nothing will require clients to be
>>> aware of these new schema constructs.
>>>
>>> Overall, I'm supportive of the distinctions discussed between
>>> constraints and guardrails and like the direction this is heading; I'd just
>>> like to make sure the more detailed semantics aren't confusing or
>>> misleading for our users, and semantics are much harder to change in the
>>> future.
>>>
>>> [1]: https://www.postgresql.org/docs/current/sql-altertable.html
>>>
>>>


Re: [VOTE] CEP-42: Constraints Framework

2024-07-02 Thread Yifan Cai
+1 on CEP-42.

- Yifan

On Tue, Jul 2, 2024 at 5:17 AM Jon Haddad  wrote:

> +1
>
> On Tue, Jul 2, 2024 at 5:06 AM  wrote:
>
>> +1
>>
>>
>> On Jul 1, 2024, at 8:34 PM, Doug Rohrer  wrote:
>>
>> +1 (nb) - Thanks for all of the suggestions and Bernardo for wrangling
>> the CEP into shape!
>>
>> Doug
>>
>> On Jul 1, 2024, at 3:06 PM, Dinesh Joshi  wrote:
>>
>> +1
>>
>> On Mon, Jul 1, 2024 at 11:58 AM Ariel Weisberg  wrote:
>>
>>> Hi,
>>>
>>> I am +1 on CEP-42 with the latest updates to the CEP to clarify syntax,
>>> error messages, constraint naming and generated naming, alter/drop,
>>> describe etc.
>>>
>>> I think this now tracks very closely to how other SQL databases define
>>> constraints and the syntax is easily extensible to multi-column and
>>> multi-table constraints.
>>>
>>> Ariel
>>>
>>> On Mon, Jul 1, 2024, at 9:48 AM, Bernardo Botella wrote:
>>>
>>> With all the feedback that came in the discussion thread after the call
>>> for votes, I’d like to extend the period another 72 hours starting today.
>>>
>>> As before, a vote passes if there are at least 3 binding +1s and no
>>> binding vetoes.
>>>
>>> Thanks,
>>> Bernardo Botella
>>>
>>> On Jun 24, 2024, at 7:17 AM, Bernardo Botella <
>>> conta...@bernardobotella.com> wrote:
>>>
>>> Hi everyone,
>>>
>>> I would like to start the voting for CEP-42.
>>>
>>> Proposal:
>>> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-42%3A+Constraints+Framework
>>> Discussion:
>>> https://lists.apache.org/thread/xc2phmxgsc7t3y9b23079vbflrhyyywj
>>>
>>> The vote will be open for 72 hours. A vote passes if there are at least
>>> 3 binding +1s and no binding vetoes.
>>>
>>> Thanks,
>>> Bernardo Botella
>>>
>>>
>>>
>>
>>


Re: Contributing cassandra-diff

2019-08-22 Thread Yifan Cai
Great addition to the tool set!

A separate repo would be better.

Grouping repos together only to make them easier to index does not seem to be a
strong supporting reason. Just my 2 cents.

- Yifan



From: Dinesh Joshi 
Sent: Thursday, August 22, 2019 11:42 AM
To: dev
Subject: Re: Contributing cassandra-diff

+1 on a discrete repo.

Dinesh

> On Aug 22, 2019, at 9:14 AM, Michael Shuler  wrote:
>
> CI git polling for changes on a separate repository (if/when CI is needed) is 
> probably a better way to go. I don't believe there are any issues with INFRA 
> on us having discrete repos, and creating them with the self-help web tool is 
> quick and easy.
>
> Thanks for the neat looking utility!
>
> Michael
>
> On 8/22/19 10:33 AM, Sankalp Kohli wrote:
>> A different repo will be better
>>> On Aug 22, 2019, at 6:16 AM, Per Otterström  
>>> wrote:
>>>
>>> Very powerful tool indeed, thanks for sharing!
>>>
>>> I believe it is best to keep tools like this in different repos since 
>>> different tools will probably have different life cycles and tool chains. 
>>> Yes, that could be handled in a single repo, but with different repos we'd 
>>> get natural boundaries.
>>>
>>> -Original Message-
>>> From: Sumanth Pasupuleti 
>>> Sent: den 22 augusti 2019 14:40
>>> To: dev@cassandra.apache.org
>>> Subject: Re: Contributing cassandra-diff
>>>
>>> No hard preference on the repo, but just excited about this tool! Looking 
>>> forward to employing this for upgrade testing (very timely :))
>>>
>>>> On Thu, Aug 22, 2019 at 3:38 AM Sam Tunnicliffe  wrote:
>>>>
>>>> My own weak preference would be for a dedicated repo in the first
>>>> instance. If/when additional tools are contributed we should look at
>>>> co-locating common stuff, but rushing toward a monorepo would be a
>>>> mistake IMO.

>> On 22 Aug 2019, at 11:10, Jeff Jirsa  wrote:
>
> I weakly prefer contrib.
>
>
> On Thu, Aug 22, 2019 at 12:09 PM Marcus Eriksson
> 
 wrote:
>
>> Hi, we are about to open source our tooling for comparing two
>> cassandra clusters and want to get some feedback where to push it.
>> I think the options are: (name bike-shedding welcome)
>>
>> 1. create repos/asf/cassandra-diff.git
>> 2. create a generic repos/asf/cassandra-contrib.git where we can add more
>> contributed tools in the future
>>
>> Temporary location:
>> https://github.com/krummas/cassandra-diff
>>
>> Cassandra-diff is a spark job that compares the data in two clusters - it
>> pages through all partitions and reads all rows for those partitions in
>> both clusters to make sure they are identical. Based on the configuration
>> variable “reverse_read_probability” the rows are either read forward or in
>> reverse order.
>>
>> Our main use case for cassandra-diff has been to set up two identical
>> clusters, transfer a snapshot from the cluster we want to test to these
>> clusters and upgrade one side. When that is done we run this tool to make
>> sure that 2.1 and 3.0 give the same results. A few examples of the bugs we
>> have found using this tool:
>>
>> * CASSANDRA-14823: Legacy sstables with range tombstones spanning multiple
>> index blocks create invalid bound sequences on 3.0+
>> * CASSANDRA-14803: Rows that cross index block boundaries can cause
>> incomplete reverse reads in some cases
>> * CASSANDRA-15178: Skipping illegal legacy cells can break reverse
>> iteration of indexed partitions
>>
>> /Marcus
>>
>> ---
>> -- To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
>> For additional commands, e-mail: dev-h...@cassandra.apache.org
>>
>>




>
>



Re: [DISCUSS] Switch to using GitHub pull requests?

2020-01-22 Thread Yifan Cai
+1 nb to the PR approach for reviewing.


And thanks David for initiating the discussion. I would like to put my 2
cents in it.


IMO, review comments are better associated with the changes, precisely at
the line level, if they are put in the PR rather than in the JIRA comments.
Discussions regarding each review comment are naturally organized into
their own dedicated threads. I agree that JIRA comments are more suitable for
high-level discussion regarding the design. But review comments in a PR can
do a better job at code-level discussion.


Another benefit is to relieve reviewers’ work. In the PR approach, we can
leverage the PR build step to perform an initial qualification. The actual
review can be deferred until the PR build passes. So reviewers are sure
that the change is good at a certain level, i.e. it builds and the tests
pass. Right now, contributors volunteer the link to the CI test run
(however, one still needs to open the link to see the result).

On Wed, Jan 22, 2020 at 3:16 PM David Capwell  wrote:

> Thanks for the links Benedict!
>
> Been reading the links and see the following points being made
>
> *) enabling the spark process would lower the process to enter the project
> *) high level discussions should be in JIRA [1]
> *) not desirable to annotate both JIRA and Github; should only annotate JIRA
> (reviewer, labels, etc.)
> *) given the multi branch nature, pull requests are not intuitive [2]
> *) merging is problematic and should keep the current merge process
> *) commits@ is not usable with PRs
> *) commits@ is better because of PRs
> *) people are more willing to nit-pick with PRs, less likely with current
> process [3]
> *) opens potential to "prevent commits that don't pass the tests" [4]
> *) prefer the current process
> http://cassandra.apache.org/doc/latest/development/patches.html [5]
> *) current process is annoying since you have to take the link in github
> and attach to JIRA for each comment in review
> *) missed notifications, more trust in commits@
> *) if someone rewrites history, comments could be hard to see
> *) its better to leave comments in the source code so people don't need to
> lookup github
>
> Here is how i see some of the points
>
> 1) I agree with the point that the high level discussions should be in
> JIRA; PRs are better at specific review and offer no real benefit over JIRA
> for larger structural changes
> 2) there are different patterns with multiple branches as well, but some of
> it is possible to codify and include in CI.  For example, you could take
> the diff, attempt to apply to 2.2 (maybe if [dtest] in commit?) and forward
> merge; if any conflicts are found, could annotate JIRA that the change is
> complex and may be best to submit multiple PRs.  Assuming we want something
> like this, it is also possible to run the tests against those branches as
> well.  I am not saying we do this, but saying that it is possible to
> improve or solve this problem, so doesn't appear a blocker to me.
> 3) by making it easier to comment i can definitely see this happen, but
> don't see this as a reason not to.  I find that you are more willing to
> actually talk about small sections of the code in PR than in other forms
> and that its easier to track.  One of the things i see now is that the
> conversation moves to slack, so is it better not happening, happening in
> slack, or happening in github?
> 4) This is actually why i started this thread.  I created a patch a while
> back that passed review, got merged, and has been failing the build ever
> since.  I would like to make it more clear that code is likely to do this
> or not.
> 5) The link documents the process as submitting patches generated by "git
> format-patch", which i was told not to do for my first patch
>
> Think i summarized all I saw.
>
> On Wed, Jan 22, 2020 at 2:30 PM Dinesh Joshi  wrote:
>
> > I personally use Github PRs to discuss the changes if there is feedback
> on
> > the code. The discussion does get linked with the JIRA ticket. However,
> > committing is manual.
> >
> > Dinesh
> >
> > > On Jan 22, 2020, at 2:20 PM, David Capwell  wrote:
> > >
> > > When submitting or reviewing a change in JIRA I notice that we have
> three
> > > main patterns for doing this: link branch, link diff, and link GitHub
> > pull
> > > request (PR); I wanted to bring up the idea of switching over to GitHub
> > > pull requests as the norm.
> > >
> > >
> > > Why should we do this?  The main reasons I can think of are:
> consistency
> > > within the project, common pattern outside and inside Apache (not a new
> > > process for new members to learn),
> > >
> > > PRs are easier to review and comment on (much easier than linking lines
> > in
> > > a branch), Github and JIRA integration is already present so all
> > > conversations will be added to the JIRA work log, and could be linked
> > with
> > > Jenkins to trigger builds and tests and to report the status into JIRA.
> > >
> > >
> > > How would one start to do t

Re: [DISCUSS] Client protocol changes (Was: 20200217 4.0 Status Update)

2020-02-18 Thread Yifan Cai
CustomPayload should be used to provide customization via a custom query
handler (that is outside of the Cassandra source).
Supporting a custom timeout per query is a new feature. It is clearer to
assign a dedicated query flag. In V5, the available number of query flags
expanded from 8 (in V4 and prior) to 32. We do not need to piggyback on the
custom payload. Using a dedicated flag gives the same forward and backward
compatibility as the custom payload.
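
A rough sketch of why a dedicated flag is feasible in V5, assuming the
query flags field is one byte in V4 and earlier and a 4-byte integer in V5
(which matches the 8-versus-32 count above). WITH_CUSTOM_TIMEOUT and its
bit position are hypothetical, not an assigned protocol flag, and Python is
used only for brevity.

```python
import struct

# Hypothetical bit for a per-query timeout, chosen above the low 8 bits so it
# only fits in the wider V5 flags field. Not a real protocol assignment.
WITH_CUSTOM_TIMEOUT = 1 << 9


def encode_query_flags(flags: int, protocol_version: int) -> bytes:
    """Serialize query flags: 1 byte up to V4, 4 bytes (big-endian) from V5."""
    if protocol_version >= 5:
        if not 0 <= flags < 1 << 32:
            raise ValueError("flags do not fit in 32 bits")
        return struct.pack(">I", flags)
    if not 0 <= flags < 1 << 8:
        raise ValueError("flags do not fit in 8 bits before V5")
    return struct.pack(">B", flags)


def has_flag(encoded: bytes, flag: int) -> bool:
    """Check a flag on an encoded flags field of either width."""
    return bool(int.from_bytes(encoded, "big") & flag)
```

In this sketch a V4 encoder simply cannot express the new bit, while a V5
peer that does not know the bit can still parse the field and ignore it.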

On Tue, Feb 18, 2020 at 2:28 PM David Capwell  wrote:

> Given the JIRA in question, if you want to override the timeout to lower
> it, then the worst case if not supported yet is that you get the default
> timeout.  So this then makes me wonder "is there a way to add metadata to a
> message which is ignored if unknown" (aka forward compatibility).  Skimming
> the frame code i see we have
>
> boolean isCustomPayload =
> frame.header.flags.contains(Frame.Header.Flag.CUSTOM_PAYLOAD);
> boolean hasWarning =
> frame.header.flags.contains(Frame.Header.Flag.WARNING);
>
> UUID tracingId = isRequest || !isTracing ? null :
> CBUtil.readUUID(frame.body);
> List<String> warnings = isRequest || !hasWarning ? null :
> CBUtil.readStringList(frame.body);
> Map<String, ByteBuffer> customPayload = !isCustomPayload ? null :
> CBUtil.readBytesMap(frame.body);
>
> This makes me wonder if we could piggyback off that for new features, that
> way older servers just ignore them. I have no idea of the negatives of
> customPayload (other than strings are more bytes for messages, evolution
> may be based off key names so annoying, etc.), but tags which are ignored
> sounds promising
>
>
> On Tue, Feb 18, 2020 at 1:53 PM Jeff Jirsa  wrote:
>
> > A few notes:
> >
> > - Protocol changes add work to the rest of the ecosystem. Drivers have to
> > update, etc.
> > - Nobody expects protocol changes in minors, though it's less of a
> concern
> > if we don't deprecate out the older version. E.g. if 4.0 launches with
> > protocol v4 and protocol v5, and then 4.0.2 adds protocol v6, do we
> > deprecate out v4? If yes, you potentially break clients that only
> supported
> > v3 and v4 in a minor version upgrade, which is unexpected. If not, how
> many
> > protocol versions are you willing to support at any given time?
> > - Having protocol changes introduces risk. Paging behavior across
> protocol
> > versions is the site of a number of different bugs recently.
> >
> >
> > On Tue, Feb 18, 2020 at 1:46 PM Tolbert, Andrew 
> > wrote:
> >
> > > I don't know the technical answer, but I suspect two motivations for
> > > doing new protocol versions in major releases could include:
> > >
> > > * protocol changes can be tied to feature changes that typically come
> > > in a major release.
> > > * protocol changes should be as infrequent as major releases.  Each
> > > new protocol version is another thing in the test matrix that needs to
> > > be tested.
> > >
> > > That last point can make it hard to get new changes in. If something
> > > doesn't make the upcoming protocol version, it might be years before
> > > another one, but I also think it's worth it to do this infrequently as
> > > it makes maintaining client and server code easier if there are less
> > > protocol versions to worry about.
> > >
> > > On the client-side, libraries themselves should be avoiding making
> > > Cassandra version checks when detecting capabilities.  There are a few
> > > exceptions, such as system table parsing for schema & peers,
> > > but those aren't related to the protocol.
> > >
> > > Thanks,
> > > Andy
> > >
> > >
> > >
> > >
> > >
> > > On Tue, Feb 18, 2020 at 1:22 PM Nate McCall 
> wrote:
> > > >
> > > > [Moving to new message thread]
> > > >
> > > > Thanks for bringing this up, Jordan.
> > > >
> > > > IIRC, this was more a convention than a technical reason. Though I
> > could
> > > be
> > > > completely misremembering this.
> > > >
> > > > -- Forwarded message -
> > > > From: Jordan West 
> > > > Date: Wed, Feb 19, 2020 at 10:13 AM
> > > > Subject: Re: 20200217 4.0 Status Update
> > > > To: 
> > > >
> > > >
> > > > On Mon, Feb 17, 2020 at 12:52 PM Jeff Jirsa 
> wrote:
> > > >
> > > > >
> > > > > beyond the client proto change being painful for anything other
> than
> > > major
> > > > > releases
> > > > >
> > > > >
> > > > This came up during the community meeting today and I wanted to
> bring a
> > > > question about it to the list: could someone who is *very* familiar
> > with
> > > > the client proto share w/ the list why changing the proto in anything
> > > other
> > > > than a major release is so difficult? I hear this a lot and it seems
> to
> > > be
> > > > fact. So that all of us don't have to go read the code, a brief
> summary
> > > > would be super helpful. Or if there is a ticket that already covers
> > this
> > > > even better! I'd also be curious if there have ever been any thoughts
> > to
> > > > address it as it seems to be a consistent hurdle during the release
> > cycle
> > > > and one that tends to further incre

Re: Keeping test-only changes out of CHANGES.txt

2020-04-08 Thread Yifan Cai
+1


From: Jasonstack Zhao Yang 
Sent: Wednesday, April 8, 2020 9:04:51 AM
To: dev@cassandra.apache.org 
Subject: Re: Keeping test-only changes out of CHANGES.txt

+1

On Thu, Apr 9, 2020, 00:04 Aleksey Yeshchenko 
wrote:

> +1
>
> > On 8 Apr 2020, at 15:08, Mick Semb Wever  wrote:
> >
> > Can we agree on keeping such test changes out of CHANGES.txt ?
> >
> > We already don't put entries into CHANGES.txt if it is not a change
> > from any previous release.
> >
> > There was some discussion before¹ about this, and the problem that
> > being selective meant what ended up there being arbitrary. I think
> > this can be solved with an easy rule of thumb that if it only touches
> > *Test.java classes, or it is only about fixing a test, then it
> > shouldn't be in CHANGES.txt. That means if the patch does touch any
> > runtime code then you do still need to add an entry to CHANGES.txt.
> > This avoids the whole "arbitrary" problem,  and maintains CHANGES.txt
> > as user-facing formatted text to be searched through.
> >
> > If there's agreement I can commit to going through 4.0 changes and
> > removing those that never touched runtime code.
> >
> > regards,
> > Mick
> >
> > ¹)
> https://lists.apache.org/thread.html/a94946887081d8a408dd5cd01a203664f4d0197df713f0c63364a811%40%3Cdev.cassandra.apache.org%3E
> >
> >
>
>
>
>
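
The rule of thumb proposed in this thread could be sketched as a small
check over the paths a patch touches. This is only an illustration: the
function name and the assumption that test code lives under test/ or in
*Test.java files are mine, not project tooling, and Python is used only for
brevity.

```python
def needs_changes_entry(changed_paths):
    """Return True if any changed file looks like runtime (non-test) code."""

    def is_test_only(path):
        # Assumed convention: tests live under test/ or are named *Test.java.
        return path.startswith("test/") or path.endswith("Test.java")

    return any(not is_test_only(p) for p in changed_paths)
```

A patch touching only test files yields False (no CHANGES.txt entry); as
soon as one runtime file is included the result is True.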


Re: Simplify voting rules for in-jvm-dtest-api releases

2020-04-15 Thread Yifan Cai
+1


From: Sam Tunnicliffe 
Sent: Wednesday, April 15, 2020 7:49:50 AM
To: dev@cassandra.apache.org 
Subject: Re: Simplify voting rules for in-jvm-dtest-api releases

+1

> On 15 Apr 2020, at 14:35, Oleksandr Petrov  wrote:
>
> Hi everyone,
>
> Apache release rules were made for first-class projects. I would like to
> propose simplifying voting rules for in-jvm-dtest-api project [1].
>
> A bit of background: in-jvm-dtest-api is a project that is used by all
> active Cassandra branches (2.2, 3.0, 3.11, and trunk) to unify interfaces
> that allow creating clusters and running tests, much like Python dtests,
> just with a potential to run and develop them faster. Previously, anyone
> could break API compatibility by committing to only one of the branches and
> not updating the other one, which has happened on several occasions and has
> went unnoticed, and has added work for people who had to bring changes to
> more than one branch. This unified API and tests are particularly useful
> for upgrade tests, but are also good for any kind of testing.
>
> Since this project was made to simplify contributions to in-jvm dtests,
> it'd be great if making changes to this project would actually be simple.
> Before that, in order to make changes in in-jvm-dtest API, we required
> only +1 from a contributor and a committer could just commit the change.
>
> I would say that in order to cut a (minor) release of in-jvm-dtest-api you
> should:
>
> 1. Get a +1 from a contributor who can review and test your change
> 2. Get a +1 from one of committers who are familiar with in-jvm dtests (we
> have enough, I just don't want to volunteer anyone)
>
> This will guard us from unnecessary changes, and add an extra pair of eyes
> for things that influence moore than one branch, but leave us flexible
> enough to be able to move forward without conducting a vote.
>
> Since in-jvm-dtest-api is only used to test Cassandra, and isn't intended
> for production purposes, this is a low-risk change in procedure. Moreover,
> even if we package in-jvm-dtest-api with some Cassandra release, there will
> be an additional vote on the release, where anyone who has concerns about
> in-jvm-dtest-api changes can still voice them.
>
> Please let me know if you'd like more information about in-jvm-dtest API,
> or have comments about this change in procedure.
>
> Thank you,
> -- Alex
> [1] https://github.com/apache/cassandra-in-jvm-dtest-api





Re: [GitHub] [cassandra-diff] yifan-c opened a new pull request #8: Support running diff on multiple keyspaces

2020-05-17 Thread Yifan Cai
 +1

Thank you Mick for the notification fix.

- Yifan
On May 16, 2020, 1:29 AM -0700, Mick Semb Wever , wrote:


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.



Yifan, fix for notification to go to pr@ instead of dev@ is here:
https://github.com/apache/cassandra-diff/compare/master...thelastpickle:mck/add-asf-yml



Re: [VOTE] Project governance wiki doc

2020-06-17 Thread Yifan Cai
+1 nb
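
For reference, the vote-floor arithmetic debated in the quoted thread below
can be sketched as follows. The function names are mine and the rounding
choices (strictly more than half, at least two thirds) are assumptions that
happen to reproduce the numbers given in the thread; this is not wording
from the governance document, and Python is used only for brevity.

```python
def simple_majority(n: int) -> int:
    """Smallest count strictly greater than half of n."""
    return n // 2 + 1


def supermajority(n: int) -> int:
    """Smallest count that is at least two thirds of n (integer ceiling)."""
    return (2 * n + 2) // 3


def passes(plus_ones: int, minus_ones: int, roll_call: int) -> bool:
    """Variant from the thread: enough +1s relative to the roll call,
    and more +1s than -1s among those who voted."""
    return plus_ones >= simple_majority(roll_call) and plus_ones > minus_ones
```

With a roll call of 21, simple_majority gives the minimum participation of
11 and supermajority(11) gives the 8 +1s mentioned; with a roll call of 16,
simple_majority gives the 9 +1s in the other example.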

From: Jon Haddad 
Sent: Wednesday, June 17, 2020 2:13 PM
To: dev@cassandra.apache.org
Subject: Re: [VOTE] Project governance wiki doc

Yes, this is my understanding as well.


On Wed, Jun 17, 2020 at 2:10 PM Benedict Elliott Smith 
wrote:

> I personally think we should not revisit the super-majority of votes
> decision, as that was settled already; simple-majority came a distant
> third.  Since this question doesn't really invalidate that decision, I
> think for forward progress it's better to simply address the vote floor,
> but just my 2c.
>
> On 17/06/2020, 21:58, "Jon Haddad"  wrote:
>
> For what it's worth, I thought Benedict's suggestion was a pretty
> reasonable one and am in favor of it.
>
> On Wed, Jun 17, 2020 at 1:40 PM Joshua McKenzie 
> wrote:
>
> > Race condition on that last one Benedict.
> >
> > What about using the quorum from roll call to simply define how many +1's
> > are needed to pass something? Simple majority of the roll call, simple
> > majority of total participants on specific vote and it passes?
> >
> > For example:
> >
> >    - 33 pmc members
> >    - 16 roll call
> >    - 9 +1's required. If only participation is 9 vote with +1, passes
> >    - If 9 +1's and 10 -1's, does not pass
> >
> > That prevents the "abstain to keep vote invalid" while keeping with the
> > lazy consensus spirit and requiring enough participation that a vote should
> > reasonably be considered indicative. Does raise the bar a bit from "simple
> > majority of this many votes required" to "this many +1's required", but
> > hopefully people responding to a roll call actually plan on showing up. We
> > could also open votes with "this many +1's required to pass" which might
> > further encourage participation.
> >
> > On Wed, Jun 17, 2020 at 2:24 PM Joshua McKenzie <
> jmcken...@apache.org>
> > wrote:
> >
> >> I don't see anybody advocating for the low watermark where it
> stands.
> >> I'm +1 on the "simple majority of roll call + supermajority of that"
> >> revision, and no real harm in re-calling a vote today vs.
> yesterday; one
> >> day delay to clean this up now doesn't seem too much an imposition.
> >>
> >> @Jonathan Haddad  - want to revise the wiki
> article
> >> and call a new vote?
> >>
> >>
> >> On Wed, Jun 17, 2020 at 2:13 PM Jon Haddad 
> wrote:
> >>
> >>> Sorry, I was a bit vague there.
> >>>
> >>> I'm in favor of changing the minimum number of votes to be a simple
> >>> majority of the number of people participating in the roll call.
> For
> >>> example, if we have a roll call of 21, then we'll need a minimum
> of 11
> >>> binding votes participating.  Of that 11, we'd need 2/3 to be +1
> to pass,
> >>> so in that case 8 +1's.
> >>>
> >>> Regarding a new vote, I am personally in favor of that, yes.
> >>>
> >>>
> >>> On Wed, Jun 17, 2020 at 10:36 AM Brandon Williams <
> dri...@gmail.com>
> >>> wrote:
> >>>
> >>> > So with that (the -1), are you in favor of changing to simple
> majority
> >>> > (I am) and calling a new vote?
> >>> >
> >>> > On Wed, Jun 17, 2020 at 12:30 PM Jon Haddad 
> wrote:
> >>> > >
> >>> > > > I'm not concerned today, no, just musing and pointing out
> that
> >>> there
> >>> > are
> >>> > > easy ways to improve progress if we find there's an
> impediment.  I
> >>> don't
> >>> > > think it necessarily indicates bad intent to use voting rules
> as
> >>> > > formulated, either, for the record.
> >>> > >
> >>> > > Yeah, I didn't think you were serious about it being a
> problem, just
> >>> > wanted
> >>> > > to check.
> >>> > >
> >>> > > I'm changing my vote to a -1, in favor of a simple majority as
> the
> >>> low
> >>> > > watermark in vote participation (not approval).
> >>> > >
> >>> > > On Wed, Jun 17, 2020 at 9:56 AM Benedict Elliott Smith <
> >>> > bened...@apache.org>
> >>> > > wrote:
> >>> > >
> >>> > > > I'm not concerned today, no, just musing and pointing out
> that
> >>> there
> >>> > are
> >>> > > > easy ways to improve progress if we find there's an
> impediment.  I
> >>> > don't
> >>> > > > think it necessarily indicates bad intent to use voting
> rules as
> >>> > > > formulated, either, for the record.
> >>> > > >
> >>> > > > I do think redefining the roll call low watermark would be a
> good
> >>> > thing to
> >>> > > > do though.  It was a mistake to bring this to a vote without
> >>> discussing
> >>> > > > it.  Sorry for my part in forgetting the comment hadn't been
> >>> responded
> >>> > to,
> >>> > > > and also for the initial issue with formulation - it stemmed
> from
> >>> > poorly
> >>> > > > specifying the use 

Re: [VOTE] Project governance wiki doc (take 2)

2020-06-20 Thread Yifan Cai
+1 nb


From: Scott Andreas 
Sent: Saturday, June 20, 2020 11:00:15 AM
To: dev@cassandra.apache.org 
Subject: Re: [VOTE] Project governance wiki doc (take 2)

+1 nb

> On Jun 20, 2020, at 9:37 AM, Joshua McKenzie  wrote:
>
> +1 (binding / present / active)
>
> On Sat, Jun 20, 2020 at 12:23 PM Ekaterina Dimitrova 
> wrote:
>
>> +1(non-binding)
>>
>> On Sat, 20 Jun 2020 at 11:38, Brandon Williams  wrote:
>>
>>> +1
>>>
>>> On Sat, Jun 20, 2020, 10:12 AM Joshua McKenzie 
>>> wrote:
>>>
>>>> Link to doc:
>>>> https://cwiki.apache.org/confluence/display/CASSANDRA/Apache+Cassandra+Project+Governance
>>>>
>>>> Change since previous cancelled vote:
>>>> "A simple majority of this electorate becomes the low-watermark for votes
>>>> in favour necessary to pass a motion, with new PMC members added to the
>>>> calculation."
>>>>
>>>> This previously read "super majority". We have lowered the low water mark
>>>> to "simple majority" to balance strong consensus against risk of stall due
>>>> to low participation.
>>>>
>>>>   - Vote will run through 6/24/20
>>>>   - pmc votes considered binding
>>>>   - simple majority of binding participants passes the vote
>>>>   - committer and community votes considered advisory
>>>>
>>>> Lastly, I propose we take the count of pmc votes in this thread as our
>>>> initial roll call count for electorate numbers and low watermark
>>>> calculation on subsequent votes.
>>>>
>>>> Thanks again everyone (and specifically Benedict and Jon) for the time and
>>>> collaboration on this.
>>>>
>>>> ~Josh

>>>
>>


-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org
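
The threshold arithmetic from Jon Haddad's example in the governance
discussion above (a roll call of 21 needs at least 11 binding participants,
and 2/3 of those 11 means 8 +1s) can be sketched as follows. This is only an
illustration under stated assumptions: the class and method names are
hypothetical, "simple majority" is taken to mean floor(n/2) + 1, and the 2/3
supermajority is rounded up. The wording adopted in the final governance doc
may differ.

```java
// Hypothetical helper: illustrates the arithmetic from the thread, not any
// official project tooling.
final class VoteThresholds {
    private VoteThresholds() {}

    /** Minimum binding participants: simple majority of the roll call (assumed floor(n/2) + 1). */
    static int minParticipants(int rollCall) {
        return rollCall / 2 + 1;
    }

    /** +1s needed among the participating binding voters: two thirds, rounded up (assumption). */
    static int requiredApprovals(int participants) {
        return (2 * participants + 2) / 3; // integer form of ceil(2 * participants / 3)
    }

    /** True if enough binding voters took part and enough of them voted +1. */
    static boolean passes(int rollCall, int bindingVotes, int bindingPlusOnes) {
        return bindingVotes >= minParticipants(rollCall)
            && bindingPlusOnes >= requiredApprovals(bindingVotes);
    }

    public static void main(String[] args) {
        int rollCall = 21;
        int quorum = minParticipants(rollCall);
        // Prints: quorum=11 approvals=8
        System.out.println("quorum=" + quorum + " approvals=" + requiredApprovals(quorum));
    }
}
```

With these assumptions, a vote with 10 binding participants out of a roll call
of 21 fails regardless of how many of them vote +1, which is the "low
watermark in vote participation (not approval)" discussed in the thread.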



Re: [VOTE] Release Apache Cassandra 4.0-beta1

2020-07-16 Thread Yifan Cai
+1 nb


From: Robert Stupp 
Sent: Thursday, July 16, 2020 2:59:34 AM
To: dev@cassandra.apache.org 
Subject: Re: [VOTE] Release Apache Cassandra 4.0-beta1

+1 (nb)

—
Robert Stupp
@snazy

> On 15. Jul 2020, at 20:07, Jasonstack Zhao Yang  
> wrote:
>
> +1 (nb)
>
> On Thu, 16 Jul 2020 at 01:28, Brandon Williams  wrote:
>
>> +1 (binding)
>>
>> On Tue, Jul 14, 2020, 6:06 PM Mick Semb Wever  wrote:
>>
>>> Proposing the test build of Cassandra 4.0-beta1 for release.
>>>
>>> sha1: 5e767711360ecc4bc05a7cd219f0e680bfada004
>>> Git:
>>>
>>>
>> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/4.0-beta1-tentative
>>> Maven Artifacts:
>>>
>>>
>> https://repository.apache.org/content/repositories/orgapachecassandra-1210/org/apache/cassandra/cassandra-all/4.0-beta1/
>>>
>>> The Source and Build Artifacts, and the Debian and RPM packages and
>>> repositories, are available here:
>>> https://dist.apache.org/repos/dist/dev/cassandra/4.0-beta1/
>>>
>>> The vote will be open for 72 hours (longer if needed). Everyone who has
>>> tested the build is invited to vote. Votes by PMC members are considered
>>> binding. A vote passes if there are at least three binding +1s and no
>> -1s.
>>>
>>> Eventual publishing and announcement of the 4.0-beta1 release will be
>>> coordinated, as described in
>>>
>>>
>> https://lists.apache.org/thread.html/r537fe799e7d5e6d72ac791fdbe9098ef0344c55400c7f68ff65abe51%40%3Cdev.cassandra.apache.org%3E
>>>
>>> [1]: CHANGES.txt:
>>>
>>>
>> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=refs/tags/4.0-beta1-tentative
>>> [2]: NEWS.txt:
>>>
>>>
>> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=NEWS.txt;hb=refs/tags/4.0-beta1-tentative
>>>
>>



Re: [DISCUSS] Change style guide to recommend use of @Override

2020-09-01 Thread Yifan Cai
+1


From: Caleb Rackliffe 
Sent: Tuesday, September 1, 2020 12:08:50 PM
To: dev@cassandra.apache.org 
Subject: Re: [DISCUSS] Change style guide to recommend use of @Override

+1

On Tue, Sep 1, 2020, 2:00 PM Jasonstack Zhao Yang 
wrote:

> +1
>
> On Wed, 2 Sep 2020 at 02:45, Dinesh Joshi  wrote:
>
> > +1
> >
> > > On Sep 1, 2020, at 11:27 AM, David Capwell  wrote:
> > >
> > > > Currently our style guide recommends avoiding @Override and updates
> > > > IntelliJ's code style to exclude it by default; I would like to propose
> > > > that we change this recommendation to use it and update IntelliJ's
> > > > style to include it by default.
> > > >
> > > > @Override is used by javac to enforce that a method is in fact
> > > > overriding a method from a superclass or an interface; if this stops
> > > > being true (such as after a refactor), compilation fails with an error.
> > > > When we default to excluding it, it is harder to verify that a refactor
> > > > catches all implementations, which can lead to subtle and
> > > > hard-to-track-down bugs.
> > > >
> > > > This proposal is for new code and would not mean rewriting all code at
> > > > once; it would recommend that new code adopt this style, and that old
> > > > code related to changes being made be pulled forward (similar to our
> > > > stance on imports).
> > > >
> > > > If people are ok with this, I will file a JIRA, update the docs, and
> > > > update IntelliJ's formatting.
> > >
> > > Thanks for your time!
> >
> >
> >
> >
>
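
The refactoring hazard David describes in the proposal above can be
illustrated with a small sketch. The class names (BaseTask, RepairTask) are
hypothetical and are not taken from the Cassandra code base.

```java
// Hypothetical classes, for illustration only.
class BaseTask {
    // A hook with a default implementation that subclasses may customise.
    String describe() {
        return "base";
    }
}

class RepairTask extends BaseTask {
    // With @Override, javac checks that a matching method exists in a supertype.
    // If BaseTask.describe() were renamed (say, to description()), this class
    // would stop compiling, pointing directly at the missed implementation.
    // Without the annotation, this method would silently become a new,
    // unrelated method and callers of the renamed hook would get "base".
    @Override
    String describe() {
        return "repair";
    }
}

class OverrideDemo {
    public static void main(String[] args) {
        BaseTask task = new RepairTask();
        System.out.println(task.describe()); // prints "repair"
    }
}
```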


Re: [VOTE] Accept the Harry donation

2020-09-16 Thread Yifan Cai
+1


From: Joshua McKenzie 
Sent: Wednesday, September 16, 2020 9:30:24 AM
To: dev@cassandra.apache.org 
Subject: Re: [VOTE] Accept the Harry donation

+1


On Wed, Sep 16, 2020 at 11:22 AM, Aleksey Yeshchenko <
alek...@apple.com.invalid> wrote:

> +1
>
> On 16 Sep 2020, at 16:09, Sumanth Pasupuleti wrote:
>
> +1 (non-binding)
>
> On Wed, Sep 16, 2020 at 7:41 AM Jon Meredith 
> wrote:
>
> +1 (non-binding)
>
> On Wed, Sep 16, 2020 at 8:28 AM David Capwell
>  wrote:
>
> +1
>
> Sent from my iPhone
>
> On Sep 16, 2020, at 6:34 AM, Brandon Williams wrote:
>
> +1
>
> On Wed, Sep 16, 2020, 4:45 AM Mick Semb Wever  wrote:
>
> This vote is about officially accepting the Harry donation from Alex Petrov
> and Benedict Elliott Smith, that was worked on in CASSANDRA-15348.
>
> The Incubator IP Clearance has been filled out at
> http://incubator.apache.org/ip-clearance/apache-cassandra-harry.html
>
> This vote is a required part of the IP Clearance process. It follows the
> same voting rules as releases, i.e. from the PMC a minimum of three +1s and
> no -1s.
>
> Please cast your votes:
> [ ] +1 Accept the contribution into Cassandra
> [ ] -1 Do not
>
>


Re: [VOTE] Release dtest-api 0.0.5

2020-09-25 Thread Yifan Cai
+1 nb


From: Jon Meredith 
Sent: Friday, September 25, 2020 8:39:31 AM
To: dev@cassandra.apache.org 
Subject: Re: [VOTE] Release dtest-api 0.0.5

+1 (non-binding)

On Fri, Sep 25, 2020 at 9:16 AM Marcus Eriksson  wrote:
>
> +1
>
> On 25 September 2020 at 17:13:36, Chris Lohfink (clohfin...@gmail.com) wrote:
> > +1
> >
> > On Fri, Sep 25, 2020 at 10:11 AM Caleb Rackliffe
> > wrote:
> >
> > > +1
> > >
> > > On Fri, Sep 25, 2020 at 10:08 AM Brandon Williams
> > > wrote:
> > >
> > > > +1
> > > >
> > > > On Fri, Sep 25, 2020, 9:45 AM Oleksandr Petrov <
> > > oleksandr.pet...@gmail.com
> > > > >
> > > > wrote:
> > > >
> > > > > Proposing the test build of in-jvm dtest API 0.0.5 for release.
> > > > >
> > > > > Repository:
> > > > >
> > > > >
> > > >
> > > https://gitbox.apache.org/repos/asf?p=cassandra-in-jvm-dtest-api.git;a=shortlog;h=refs/tags/0.0.5
> > > > > Candidate SHA:
> > > > >
> > > > >
> > > >
> > > https://github.com/apache/cassandra-in-jvm-dtest-api/commit/f900334d2f61f0b10640ba7ae15958f26df72d92
> > > > > tagged with 0.0.5
> > > > > Artifact:
> > > > >
> > > > >
> > > >
> > > https://repository.apache.org/content/repositories/orgapachecassandra-1219/org/apache/cassandra/dtest-api/0.0.5/
> > > > >
> > > > > Key signature: 9E66CEC6106D578D0B1EB9BFF1000962B7F6840C
> > > > >
> > > > > Changes since last release:
> > > > >
> > > > > * CASSANDRA-16109: If user has not set nodeCount, use the node id
> > > > > topology size
> > > > > * CASSANDRA-16057: Update in-jvm dtest to expose stdout and stderr
> > > for
> > > > > nodetool
> > > > > * CASSANDRA-16120: Add ability for jvm-dtest to grep instance logs
> > > > > * CASSANDRA-16101: Add method to ignore uncaught throwables
> > > > > * CASSANDRA-16109: Collect dc/rack information and validate when
> > > > building
> > > > > * CASSANDRA-15386: Default to 3 datadirs in in-jvm dtests
> > > > > * CASSANDRA-16101: Add method to fetch uncaught exceptions
> > > > >
> > > > > The vote will be open for 24 hours. Everyone who has tested the build
> > > is
> > > > > invited to vote. Votes by PMC members are considered binding. A vote
> > > > passes
> > > > > if there are at least three binding +1s.
> > > > >
> > > > > -- Alex
> > > > >
> > > >
> > >
> >
>
>



Re: [VOTE] Release Apache Cassandra 4.0-beta3

2020-10-31 Thread Yifan Cai
+1 nb


From: Scott Andreas 
Sent: Saturday, October 31, 2020 5:44:55 PM
To: dev@cassandra.apache.org 
Subject: Re: [VOTE] Release Apache Cassandra 4.0-beta3

+1 nb

> On Oct 31, 2020, at 11:38 AM, Brandon Williams  wrote:
>
> +1
>
> Signatures and checksums match, source build works, as does dis/enablebinary.
>
> On Thu, Oct 29, 2020 at 7:30 AM Mick Semb Wever  wrote:
>>
>> Proposing the test build of Cassandra 4.0-beta3 for release.
>>
>> sha1: be716b46f2cb3b2d1f01dc225396c6284d5a35de
>> Git: 
>> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/4.0-beta3-tentative
>> Maven Artifacts:
>> https://repository.apache.org/content/repositories/orgapachecassandra-1224/org/apache/cassandra/cassandra-all/4.0-beta3/
>>
>> The Source and Build Artifacts, and the Debian and RPM packages and
>> repositories, are available here:
>> https://dist.apache.org/repos/dist/dev/cassandra/4.0-beta3/
>>
>> The vote will be open for 72 hours (longer if needed). Everyone who
>> has tested the build is invited to vote. Votes by PMC members are
>> considered binding. A vote passes if there are at least three binding
>> +1s and no -1's.
>>
>> [1]: CHANGES.txt:
>> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=refs/tags/4.0-beta3-tentative
>> [2]: NEWS.txt: 
>> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=NEWS.txt;hb=refs/tags/4.0-beta3-tentative
>>



Re: [VOTE] Release Apache Cassandra 4.0-beta4

2020-12-21 Thread Yifan Cai
+1 nb


From: Marcus Eriksson 
Sent: Monday, December 21, 2020 1:54:23 AM
To: dev@cassandra.apache.org 
Subject: Re: [VOTE] Release Apache Cassandra 4.0-beta4

+1

On Fri, Dec 18, 2020 at 08:16:16PM +0100, Mick Semb Wever wrote:
> Proposing the test build of Cassandra 4.0-beta4 for release.
>
> sha1: b0c50c10dbc443a05662b111a971a65cafa258d5
> Git:
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/4.0-beta4-tentative
> Maven Artifacts:
> https://repository.apache.org/content/repositories/orgapachecassandra-1226/org/apache/cassandra/cassandra-all/4.0-beta4/
>
> The Source and Build Artifacts, and the Debian and RPM packages and
> repositories, are available here:
> https://dist.apache.org/repos/dist/dev/cassandra/4.0-beta4/
>
> The vote will be open for 72 hours (longer if needed). Everyone who has
> tested the build is invited to vote. Votes by PMC members are considered
> binding. A vote passes if there are at least three binding +1s and no -1's.
>
> [1]: CHANGES.txt:
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=refs/tags/4.0-beta4-tentative
> [2]: NEWS.txt:
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=NEWS.txt;hb=refs/tags/4.0-beta4-tentative




Re: [DISCUSS] When to stop supporting Python 2

2021-01-25 Thread Yifan Cai
+1 nb.
If we go in the direction of dropping python2 support in 4.0 while retaining
the python2-compatible code, we probably also want to set a milestone for
removing that code completely. In 4.x or 5.0?

On Mon, Jan 25, 2021 at 9:24 AM Ekaterina Dimitrova 
wrote:

> I support the idea, since we are not removing python2-compatible code
> +1
>
> On Fri, 22 Jan 2021 at 15:14, Adam Holmberg 
> wrote:
>
> > As you may recall, CASSANDRA-10190 [1] introduced Python 3 support for
> > cqlsh. This change will be landing in 4.0. In the course of development
> and
> > discussion spanning years, it was decided to retain support for Python 2.
> > In the meantime, Python 2 was sunset (a year ago [2]). I hadn't seen a
> > discussion about whether we intend to carry on support for Python 2, so
> I'm
> > raising one here.
> >
> > 4.0 is a major release and we have an opportunity to drop support at this
> > milestone. It has been mentioned that it will not be acceptable to do in
> a
> > minor or patch release, so if it's not done for 4.0, we will need to wait
> > for the next major. I do understand that many in the project would like
> > majors on a more frequent interval post-4.0, but at this time we don't
> know
> > when that will be.
> >
> > I advocate for dropping support ASAP. I expect that users should not be
> > inconvenienced by this -- I am not aware of a major distro that has not
> had
> > python3 for years. Dropping python2 support does not mean that we would
> do
> > work to rip out python2-compatible code, just that we wouldn't advertise
> > support and any package requirements would be adjusted. We benefit by
> > removing the need to test multiple runtimes, and we wouldn't be concerned
> > with fixing python2-specific issues that may arise on the EOL runtime
> [3].
> >
> > I look forward to the discussion.
> >
> > --
> > Adam Holmberg
> > e. adam.holmb...@datastax.com
> > w. www.datastax.com
> >
> > [1] https://issues.apache.org/jira/browse/CASSANDRA-10190
> > [2] https://www.python.org/doc/sunset-python-2/
> > [3] https://issues.apache.org/jira/browse/CASSANDRA-16400
> >
>


Re: [VOTE] Release Apache Cassandra 3.0.24

2021-01-29 Thread Yifan Cai
+1 nb

On Fri, Jan 29, 2021 at 7:11 AM Aleksey Yeshchenko
 wrote:

> +1
>
> > On 29 Jan 2021, at 14:31, Ekaterina Dimitrova 
> wrote:
> >
> > +1(nb)
> >
> > On Fri, 29 Jan 2021 at 8:21, Oleksandr Petrov <
> oleksandr.pet...@gmail.com>
> > wrote:
> >
> >>> Proposing the test build of Cassandra 4.0-beta4 for release.
> >>
> >> Correction: test build of 3.0.24. The rest looks right.
> >>
> >> On Fri, Jan 29, 2021 at 1:48 PM Oleksandr Petrov <
> >> oleksandr.pet...@gmail.com>
> >> wrote:
> >>
> >>> Proposing the test build of Cassandra 4.0-beta4 for release.
> >>>
> >>> sha1: 6748ecd63cae047b5b0e8c3165088252954e9d5f
> >>> Git:
> >>>
> >>>
> >>
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/3.0.24-tentative
> >>> Maven Artifacts:
> >>>
> >>>
> >>
> https://repository.apache.org/content/repositories/orgapachecassandra-1229/org/apache/cassandra/cassandra-all/3.0.24/
> >>>
> >>> The Source and Build Artifacts, and the Debian and RPM packages and
> >>> repositories, are available here:
> >>> https://dist.apache.org/repos/dist/dev/cassandra/3.0.24/
> >>>
> >>> The vote will be open for 72 hours (longer if needed). Everyone who has
> >>> tested the build is invited to vote. Votes by PMC members are
> considered
> >>> binding. A vote passes if there are at least three binding +1s and no
> >> -1's.
> >>>
> >>> [1]: CHANGES.txt:
> >>>
> >>>
> >>
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=refs/tags/3.0.24-tentative
> >>> [2]: NEWS.txt:
> >>>
> >>>
> >>
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=NEWS.txt;hb=refs/tags/3.0.24-tentative
> >>>
> >>
> >>
> >> --
> >> alex p
> >>
>
>
>
>


Re: [VOTE] Release Apache Cassandra 3.11.10

2021-01-29 Thread Yifan Cai
+1 nb

On Fri, Jan 29, 2021 at 7:11 AM Aleksey Yeschenko 
wrote:

> +1
>
> > On 29 Jan 2021, at 14:30, Ekaterina Dimitrova 
> wrote:
> >
> > +1(nb)
> >
> > On Fri, 29 Jan 2021 at 8:04, Mick Semb Wever  wrote:
> >
> >>>
> >>> The vote will be open for 72 hours (longer if needed). Everyone who has
> >>> tested the build is invited to vote. Votes by PMC members are
> considered
> >>> binding. A vote passes if there are at least three binding +1s and no
> >> -1's.
> >>>
> >>
> >>
> >> +1
> >>
> >>
> >> Checks
> >> - signing correct
> >> - checksums are correct
> >> - source artefact builds
> >> - binary artefact runs
> >> - debian package runs
> >> - redhat package runs
> >>
> >> ( used https://github.com/apache/cassandra-builds/pull/32 )
> >>
>
>
>
>


Re: Welcome Joey Lynch as Cassandra PMC member

2024-07-24 Thread Yifan Cai
Congrats Joey!

From: Abe Ratnofsky 
Sent: Wednesday, July 24, 2024 7:55:24 AM
To: dev@cassandra.apache.org 
Subject: Re: Welcome Joey Lynch as Cassandra PMC member

Congratulations!


[DISCUSS] Backport CASSANDRA-19800 to Cassandra-4.0, 4.1 and 5.0

2024-07-26 Thread Yifan Cai
Hi everyone,

CASSANDRA-19800 is currently ready to be committed. Before that happens, I
want to propose backporting it to 4.0, 4.1 and 5.0.

The ability to notify CQLSSTableWriter users when new sstables are produced
is especially useful for Cassandra Analytics and other consumers. The API
is more reliable than monitoring the file directory.

That being said, I am aware that the patch is an improvement and therefore
trunk only. I want to ask for an exemption to backport the patch for two
reasons: it is useful for Cassandra Analytics, and it is low risk for the
Cassandra server, as it only touches CQLSSTableWriter, which is only used by
tooling.

- Yifan
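
To make the idea of "notify the writer's user when new sstables are produced"
concrete, here is a minimal sketch of a callback-style notification. It is
NOT the real CQLSSTableWriter API and does not reflect the actual signature
added by CASSANDRA-19800, which this thread does not show; every name below
(SketchSSTableWriter, onSSTableProduced, NotificationDemo) is hypothetical.
One plausible reading of "more reliable than monitoring the file directory"
is that a callback fires only once the writer itself considers an sstable
complete, whereas a directory watcher has to infer completeness from files
that may still be partially written.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical stand-in for a writer that produces sstables; not the real CQLSSTableWriter. */
final class SketchSSTableWriter {
    private final int rowsPerSSTable;
    private final Consumer<String> onSSTableProduced; // hypothetical listener hook
    private int bufferedRows = 0;
    private int generation = 0;

    SketchSSTableWriter(int rowsPerSSTable, Consumer<String> onSSTableProduced) {
        this.rowsPerSSTable = rowsPerSSTable;
        this.onSSTableProduced = onSSTableProduced;
    }

    /** Buffers a row (content ignored in this sketch) and flushes when the buffer is full. */
    void addRow(String row) {
        bufferedRows++;
        if (bufferedRows >= rowsPerSSTable) {
            flush();
        }
    }

    /** Flushes any remaining rows as a final sstable. */
    void close() {
        if (bufferedRows > 0) {
            flush();
        }
    }

    private void flush() {
        generation++;
        bufferedRows = 0;
        // The listener is invoked only after the (simulated) sstable is complete.
        onSSTableProduced.accept("sstable-" + generation);
    }
}

final class NotificationDemo {
    /** Writes five rows, two per sstable, and records each notification as an "upload". */
    static List<String> run() {
        List<String> uploaded = new ArrayList<>();
        SketchSSTableWriter writer = new SketchSSTableWriter(2, uploaded::add);
        for (int i = 0; i < 5; i++) {
            writer.addRow("row-" + i);
        }
        writer.close();
        return uploaded;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [sstable-1, sstable-2, sstable-3]
    }
}
```

In this sketch the consumer can act on each sstable as soon as it is
announced instead of waiting for the whole write to finish or polling the
output directory.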


Re: [DISCUSS] Backport CASSANDRA-19800 to Cassandra-4.0, 4.1 and 5.0

2024-07-26 Thread Yifan Cai
Thanks Jeff for restating the policy.

According to the release lifecycle doc

>
>- Missing features from newer generation releases are back-ported on
>per - PMC vote basis.
>
> https://cwiki.apache.org/confluence/display/CASSANDRA/Release+Lifecycle

We do not have a policy that strictly prevents new features on the branches
in maintenance state.

IMO, the patch qualifies as a missing feature. (As said, it is useful for
Cassandra Analytics, and it is good to have the same bridge implementation
across different Cassandra versions.)

Therefore, I would like to call for a vote.

On Fri, Jul 26, 2024 at 10:25 AM Jeff Jirsa  wrote:

> Everyone has a low risk change they want to backport to every branch, 4.0
> and 4.1 in particular are way past the point we should be adding features
>
> The policy exists and it’s a pure feature not a regression
>
>
>
>
>
> > On Jul 26, 2024, at 9:59 AM, Brandon Williams  wrote:
> >
> > Given how low risk this is, I don't see an issue with backporting it
> > and I'm sure the usefulness outweighs what risk there is. +1 (5.0.1
> > though, not 5.0.0)
> >
> > Kind Regards,
> > Brandon
> >
> >> On Fri, Jul 26, 2024 at 11:52 AM Yifan Cai  wrote:
> >>
> >> Hi everyone,
> >>
> >> CASSANDRA-19800 is currently in the state of ready to be committed.
> Before that, I want to propose backporting it to 4.0, 4.1 and 5.0.
> >>
> >> The ability to notify CQLSSTableWriter user when new sstables are
> produced is especially useful for Cassandra Analytics and other consumers.
> The API is more reliable than monitoring the file directory.
> >>
> >> That being said, I am aware that the patch is an improvement and trunk
> only. I want to ask for an exemption on backporting the patch for two
> reasons. It is useful for Cassandra Analytics. The patch is low risk for
> Cassandra server as it only touches CQLSSTableWriter, which is only used by
> toolings.
> >>
> >> - Yifan
>


Re: [DISCUSS] Backport CASSANDRA-19800 to Cassandra-4.0, 4.1 and 5.0

2024-07-26 Thread Yifan Cai
Hi Jeremiah,

It is an interesting idea. As of now, I think it is too much of a risk (or
not feasible at all) to use only the 5.0/trunk cassandra-all dependency in
Cassandra Analytics, since it depends on other components in Cassandra.

- Yifan


Re: [DISCUSS] Backport CASSANDRA-19800 to Cassandra-4.0, 4.1 and 5.0

2024-07-29 Thread Yifan Cai
It sounds like we are all good with backporting to 5.0.

Thank you all for the feedback.

- Yifan

On Fri, Jul 26, 2024 at 12:21 PM Jeff Jirsa  wrote:

>
>
> On Jul 26, 2024, at 11:09 AM, Yifan Cai  wrote:
>
> 
> Thanks Jeff for restating the policy.
>
> According to the release lifecycle doc
>
>>
>>- Missing features from newer generation releases are back-ported on
>>per - PMC vote basis.
>>
>> https://cwiki.apache.org/confluence/display/CASSANDRA/Release+Lifecycle
>
> We do not have a policy to prevent new features strictly for the branches
> in maintenance state.
>
> IMO, the patch qualifies as the missing feature. (As said, it is useful
> for Cassandra Analytics, and it is good to have the same bridge
> implementation amongst different cassandra versions)
>
> Therefore, I would like to call for a vote.
>
>
> Sure
>
> I’m -1 on 4.0 and 4.1
>
> - Jeff
>
>
> On Fri, Jul 26, 2024 at 10:25 AM Jeff Jirsa  wrote:
>
>> Everyone has a low risk change they want to backport to every branch, 4.0
>> and 4.1 in particular are way past the point we should be adding features
>>
>> The policy exists and it’s a pure feature not a regression
>>
>>
>>
>>
>>
>> > On Jul 26, 2024, at 9:59 AM, Brandon Williams  wrote:
>> >
>> > Given how low risk this is, I don't see an issue with backporting it
>> > and I'm sure the usefulness outweighs what risk there is. +1 (5.0.1
>> > though, not 5.0.0)
>> >
>> > Kind Regards,
>> > Brandon
>> >
>> >> On Fri, Jul 26, 2024 at 11:52 AM Yifan Cai  wrote:
>> >>
>> >> Hi everyone,
>> >>
>> >> CASSANDRA-19800 is currently in the state of ready to be committed.
>> Before that, I want to propose backporting it to 4.0, 4.1 and 5.0.
>> >>
>> >> The ability to notify CQLSSTableWriter user when new sstables are
>> produced is especially useful for Cassandra Analytics and other consumers.
>> The API is more reliable than monitoring the file directory.
>> >>
>> >> That being said, I am aware that the patch is an improvement and trunk
>> only. I want to ask for an exemption on backporting the patch for two
>> reasons. It is useful for Cassandra Analytics. The patch is low risk for
>> Cassandra server as it only touches CQLSSTableWriter, which is only used by
>> toolings.
>> >>
>> >> - Yifan
>>
>


Re: [DISCUSS] Backport CASSANDRA-19800 to Cassandra-4.0, 4.1 and 5.0

2024-07-30 Thread Yifan Cai
Here are my 2 cents. Maybe we need to differentiate between user-facing
improvements and ecosystem-internal improvements, or at least have a
discussion about it.
I guess that when the current policy of "improvements and new features on
trunk only" was made, it targeted user-facing improvements. Internal
changes are not exposed to Cassandra users directly.
As Josh pointed out, with more projects in the ecosystem (sidecar and
analytics depend on Cassandra's public interface), we are more likely to
encounter scenarios where we want to modify the mainline branches for
integration purposes.

The downside of preventing integration updates to the older branches is
that the other projects under the Cassandra umbrella end up with a different
solution per Cassandra version. That is a maintenance pain and a potential
source of errors. It was my original motivation for backporting the patch to
the other branches.

- Yifan

On Tue, Jul 30, 2024 at 6:04 AM Josh McKenzie  wrote:

> Some thoughts:
>
>1. Most of our PMC votes are majority-based, not binding -1. So Jeff
>being -1 doesn't mean the whole PMC being -1. So don't take his -1 as being
>a show stopper or indicative of everyone on the PMC (and don't take me
>saying this as the converse ;))
>2. I expect we have a lot of debt when it comes to our ecosystem
>integrations on older branches. Bringing those projects into the ASF
>umbrella and into the project ecosystem is at odds with a hard policy of
>"we don't add improvements or new features to old branches",
>*specifically* in cases like this where the desire is to get uniform
>support for ecosystem projects across all supported branches of C*
>3. We're moving into a world where we will likely more frequently
>modify the mainline branch with new functionality to integrate with
>ecosystem changes (sidecar, analytics, drivers?). It's probably at least
>worth a conversation as to whether our current policy (improvements and new
>features main branch only) is optimal across everything equally or if there
>should be nuance for ecosystem integrations.
>4. To Jeff's point: everyone is always going to have some minor
>improvement they'd like to back-port to older branches.
>
> I haven't thought deeply enough about this specific situation to have a
> well formed opinion, but figured calling out the above things is worth
> doing. This probably won't be the last time we look at our supported
> branches and have some pain we'd like to address based on the inconsistent
> ecosystem support and API piece across them.
>
> On Mon, Jul 29, 2024, at 1:32 PM, Yifan Cai wrote:
>
> It sounds like we are all good with backporting to 5.0.
>
> Thank you all for the feedback.
>
> - Yifan
>
> On Fri, Jul 26, 2024 at 12:21 PM Jeff Jirsa  wrote:
>
>
>
>
> On Jul 26, 2024, at 11:09 AM, Yifan Cai  wrote:
>
> 
> Thanks Jeff for restating the policy.
>
> According to the release lifecycle doc
>
>
>- Missing features from newer generation releases are back-ported on
>per - PMC vote basis.
>
> https://cwiki.apache.org/confluence/display/CASSANDRA/Release+Lifecycle
>
> We do not have a policy to prevent new features strictly for the branches
> in maintenance state.
>
> IMO, the patch qualifies as the missing feature. (As said, it is useful
> for Cassandra Analytics, and it is good to have the same bridge
> implementation amongst different cassandra versions)
>
> Therefore, I would like to call for a vote.
>
>
> Sure
>
> I’m -1 on 4.0 and 4.1
>
> - Jeff
>
>
> On Fri, Jul 26, 2024 at 10:25 AM Jeff Jirsa  wrote:
>
> Everyone has a low risk change they want to backport to every branch, 4.0
> and 4.1 in particular are way past the point we should be adding features
>
> The policy exists and it’s a pure feature not a regression
>
>
>
>
>
> > On Jul 26, 2024, at 9:59 AM, Brandon Williams  wrote:
> >
> > Given how low risk this is, I don't see an issue with backporting it
> > and I'm sure the usefulness outweighs what risk there is. +1 (5.0.1
> > though, not 5.0.0)
> >
> > Kind Regards,
> > Brandon
> >
> >> On Fri, Jul 26, 2024 at 11:52 AM Yifan Cai  wrote:
> >>
> >> Hi everyone,
> >>
> >> CASSANDRA-19800 is currently in the state of ready to be committed.
> Before that, I want to propose backporting it to 4.0, 4.1 and 5.0.
> >>
> >> The ability to notify CQLSSTableWriter user when new sstables are
> produced is especially useful for Cassandra Analytics and other consumers.
> The API is more reliable than monitoring the file directory.
> >>
> >> That being said, I am aware that the patch is an improvement and trunk
> only. I want to ask for an exemption on backporting the patch for two
> reasons. It is useful for Cassandra Analytics. The patch is low risk for
> Cassandra server as it only touches CQLSSTableWriter, which is only used by
> toolings.
> >>
> >> - Yifan
>
>
>


Re: [DISCUSS] Backport CASSANDRA-19800 to Cassandra-4.0, 4.1 and 5.0

2024-07-31 Thread Yifan Cai
Hi PMC team,

There are so far two +1s and one -1. Please vote if you want to. The vote is
open for another 12 hours.

A 4.1 release is coming up. I would like to include the patch, if possible,
depending on the vote result.

I recognize that patches to stable releases can be risky. When weighing the
trade-off, we need to evaluate the benefit holistically, considering all the
projects under the Cassandra umbrella.
We surely do not want to backport user-facing Cassandra features and
potentially demotivate upgrades.
IMO, internal changes should be treated differently, as long as the public
interface does not change. In this particular case, Cassandra Analytics
provides the same bulk write feature regardless of whether the patch is
backported. But having the patch backported improves the code quality and
brings other benefits. I hope this answers Mick's message.

Maybe it is time to start thinking about categorizing the interfaces in
Cassandra into 1) public API, 2) internal API, 3) evolving API, etc. That is
not a suitable topic for this thread and requires a separate discussion.

- Yifan

On Tue, Jul 30, 2024 at 11:47 AM Mick Semb Wever  wrote:

> reply below.
>
>  We're moving into a world where we will likely more frequently modify
>> the mainline branch with new functionality to integrate with ecosystem
>> changes (sidecar, analytics, drivers?). It's probably at least worth a
>> conversation as to whether our current policy (improvements and new
>> features main branch only) is optimal across everything equally or if there
>> should be nuance for ecosystem integrations.
>>
>
>
> This also incentivises intentionally not introducing support for that api
> in older mainlines.  We KISS, if the user wants that ecosystem benefit they
> need to upgrade to at least mainline X.
>
> Once older mainlines have it then we have this problem.  An alternative to
> the risk of having to always update all the mainlines, is to let the
> ecosystem branch to provide support for the different mainlines as/when
> needed.  Both are painful.
>
>


Re: [DISCUSS] Backport CASSANDRA-19800 to Cassandra-4.0, 4.1 and 5.0

2024-07-31 Thread Yifan Cai
Hi Scott,

> – What's this for? I'd appreciate a detailed explanation of what "Enhance
> CQLSSTableWriter to notify clients on sstable production" does and how it's
> meant to be used. Why is it needed for rolling upgrades? The phrasing of
> the ticket right now reads as nice-to-have rather than must-have. An
> earlier email described the value as "code quality and brings other
> benefits," but I'd expect the standard for feature backports to be higher.
>

The patch to "CQLSSTableWriter" adds support for sending notifications to
clients (Cassandra Analytics) when a new sstable is produced. On receiving
the notifications, Cassandra Analytics can send sstables more eagerly and
hence reclaim local disk space sooner.
To clarify, the patch is *not* needed for rolling upgrades. I mentioned in
an earlier email that if it is not backported, there will be a different
solution for the lower versions. The value of backporting is to eliminate
the need to develop and maintain multiple solutions (in Cassandra
Analytics).
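
To make the "notify, then send eagerly" idea concrete, here is a minimal
sketch. Every name in it (EagerShipSketch, NotifyingWriter, runDemo) is
hypothetical and chosen for illustration only; it is not the
CASSANDRA-19800 API and does not use CQLSSTableWriter. It only shows the
pattern: the writer calls a listener for each finished file, and the
listener ships the file and deletes the local copy immediately instead of
waiting for the whole write to finish.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class EagerShipSketch {
    // Hypothetical stand-in for a writer that reports each finished file to a listener.
    static final class NotifyingWriter {
        private final Path dir;
        private final Consumer<Path> onProduced;
        private int counter = 0;

        NotifyingWriter(Path dir, Consumer<Path> onProduced) {
            this.dir = dir;
            this.onProduced = onProduced;
        }

        // Writes one file and notifies the listener as soon as the file is complete.
        void flush(byte[] data) throws IOException {
            Path file = dir.resolve("data-" + (counter++) + ".db");
            Files.write(file, data);
            onProduced.accept(file);
        }
    }

    // Simulates "ship, then reclaim": records the file name, then deletes the local copy.
    static List<String> runDemo(Path dir, int flushes) throws IOException {
        List<String> shipped = new ArrayList<>();
        NotifyingWriter writer = new NotifyingWriter(dir, file -> {
            shipped.add(file.getFileName().toString()); // stand-in for an upload
            try {
                Files.delete(file);                      // reclaim local disk right away
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
        for (int i = 0; i < flushes; i++) {
            writer.flush(new byte[] {1, 2, 3});
        }
        return shipped;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("sstables");
        System.out.println(runDemo(dir, 3));
    }
}
```

Without the callback, the consumer in this sketch would have to wait until
all files exist before shipping any of them, so peak local disk usage would
grow with the total output rather than with a single file.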

– What's not possible if this isn't backported? What experience suffers
> today for lack of it / what problem does it solve? And what is the
> alternative/fallback if others are not supportive of backport?
>

It has *no* impact on the Cassandra server. Without the backport, only
Cassandra Analytics is affected. The problem and the alternative are stated above.

And to the last question, the answer is that it is *not* required for
upgrade.

Thank you for putting the questions together. Others probably have the same
questions. Hopefully this addresses your concerns.

I also agree that the 4.1.x release should not be blocked by this patch.

- Yifan

On Wed, Jul 31, 2024 at 2:14 PM C. Scott Andreas 
wrote:

> There are a few things unclear to me in this thread and the ticket, and
> details in the Jira are slim.
>
> Yifan / others supportive of backporting this feature, could you help me
> answer these questions?
>
> – What's this for? I'd appreciate a detailed explanation of what "Enhance
> CQLSSTableWriter to notify clients on sstable production" does and how it's
> meant to be used. Why is it needed for rolling upgrades? The phrasing of
> the ticket right now reads as nice-to-have rather than must-have. An
> earlier email described the value as "code quality and brings other
> benefits," but I'd expect the standard for feature backports to be higher.
>
> – What's not possible if this isn't backported? What experience suffers
> today for lack of it / what problem does it solve? And what is the
> alternative/fallback if others are not supportive of backport?
>
> – If this is deemed required for upgrade, that means users of previous
> releases would have to first upgrade to the latest release on their current
> train before upgrading to the latest major version. This is not a pattern
> that has been required in the past, and we should not introduce such a
> requirement. Do you intend this to be the required path for 4.1.x upgrades
> to 5.x+?
>
> My general thoughts:
>
> – The patch is small and the feature is small so I don't have much concern
> with a backport; zero-ish vote.
> – It definitely shouldn't block release of the current 4.1.x vote thread
> that's in progress.
> – We shouldn't introduce upgrade paths that require users to upgrade to
> at-minimum-patchlevel versions of current releases before upgrading to
> future majors; I'm -1 on that.
>
> Thanks,
>
> – Scott
>
> On Jul 31, 2024, at 2:04 PM, Jon Haddad  wrote:
>
>
> I'm kind of neutral on this, maybe -0.  It's a small enough patch, but
> it's of limited value, given that Cassandra Analytics doesn't work with
> vnodes. That's the overwhelming majority of deployments.  So I'm not really
> sure what we gain here.
>
> On Wed, Jul 31, 2024 at 1:58 PM Yifan Cai  wrote:
>
>> Hi PMC team,
>>
>> There are so far two +1 and one -1. Please vote if you want to. It is
>> open for another 12 hours.
>>
>> 4.1 is to be released. I would like to include the patch, if possible,
>> according to the vote result.
>>
>> I recognize that patches to stable releases can be risky. When talking
>> about the trade off, we need to evaluate the benefit holistically,
>> considering all the projects under the Cassandra umbrella.
>> We surely do not want to backport user-facing Cassandra features and
>> potentially demotivate upgrades.
>> IMO, the internal changes should be treated differently, as long as the
>> public interface does not change. In this particular case, Cassandra
>> Analytics provides the same bulk write feature regardless of backporting
>>

Re: [DISCUSS] Backport CASSANDRA-19800 to Cassandra-4.0, 4.1 and 5.0

2024-08-02 Thread Yifan Cai
I've realized that, although a vote was called, I forgot to create the
voting thread. I apologize for the oversight.

It looks like there's been no further discussion on this topic, so I'll go
ahead and set up a dedicated voting thread for the backport proposal
shortly.

Please vote once it is up.

- Yifan

On Thu, Aug 1, 2024 at 5:14 AM Sam Tunnicliffe  wrote:

> Sorry to derail the discussion but just on a point of order, there is
> actually precedent for requiring a minimum patch level before a major
> upgrade. For instance, from NEWS.txt:
>
> Upgrade to 3.0 is supported from Cassandra 2.1 versions greater or equal
> to 2.1.9, or Cassandra 2.2 versions greater or equal to 2.2.2.
>
> This approach has also been mentioned [1][2] as a means to introduce a
> property or setting to disable schema changes and the like before a major
> upgrade, though in this case it would be optional not required.
>
> [1]
> https://issues.apache.org/jira/browse/CASSANDRA-19556?focusedCommentId=17848544&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17848544
> [2]
> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-21%3A+Transactional+Cluster+Metadata#CEP21:TransactionalClusterMetadata-MigrationPlan
>
> On 31 Jul 2024, at 22:13, C. Scott Andreas  wrote:
>
> There are a few things unclear to me in this thread and the ticket, and
> details in the Jira are slim.
>
> Yifan / others supportive of backporting this feature, could you help me
> answer these questions?
>
> – What's this for? I'd appreciate a detailed explanation of what "Enhance
> CQLSSTableWriter to notify clients on sstable production" does and how it's
> meant to be used. Why is it needed for rolling upgrades? The phrasing of
> the ticket right now reads as nice-to-have rather than must-have. An
> earlier email described the value as "code quality and brings other
> benefits," but I'd expect the standard for feature backports to be higher.
>
> – What's not possible if this isn't backported? What experience suffers
> today for lack of it / what problem does it solve? And what is the
> alternative/fallback if others are not supportive of backport?
>
> – If this is deemed required for upgrade, that means users of previous
> releases would have to first upgrade to the latest release on their current
> train before upgrading to the latest major version. This is not a pattern
> that has been required in the past, and we should not introduce such a
> requirement. Do you intend this to be the required path for 4.1.x upgrades
> to 5.x+?
>
> My general thoughts:
>
> – The patch is small and the feature is small so I don't have much concern
> with a backport; zero-ish vote.
> – It definitely shouldn't block release of the current 4.1.x vote thread
> that's in progress.
> – We shouldn't introduce upgrade paths that require users to upgrade to
> at-minimum-patchlevel versions of current releases before upgrading to
> future majors; I'm -1 on that.
>
> Thanks,
>
> – Scott
>
> On Jul 31, 2024, at 2:04 PM, Jon Haddad  wrote:
>
>
> I'm kind of neutral on this, maybe -0.  It's a small enough patch, but
> it's of limited value, given that Cassandra Analytics doesn't work with
> vnodes. That's the overwhelming majority of deployments.  So I'm not really
> sure what we gain here.
>
> On Wed, Jul 31, 2024 at 1:58 PM Yifan Cai  wrote:
>
>> Hi PMC team,
>>
>> There are so far two +1 and one -1. Please vote if you want to. It is
>> open for another 12 hours.
>>
>> 4.1 is to be released. I would like to include the patch, if possible,
>> according to the vote result.
>>
>> I recognize that patches to stable releases can be risky. When talking
>> about the trade off, we need to evaluate the benefit holistically,
>> considering all the projects under the Cassandra umbrella.
> >> We surely do not want to backport user-facing Cassandra features and
> >> potentially demotivate upgrades.
>> IMO, the internal changes should be treated differently, as long as the
>> public interface does not change. In this particular case, Cassandra
>> Analytics provides the same bulk write feature regardless of backporting
>> the patch or not. But having the patch backported improves the code quality
>> and brings other benefits. Hope it answers Mick's message.
>>
>> Maybe it is time we start to think about categorizing the interfaces in
>> Cassandra into 1) public API, 2) internal API, and 3) evolving API, etc. It
>> is not a suitable topic for this thread and requires a separate discussion.
>>
>> - Yi

[VOTE] Backport CASSANDRA-19800 to Cassandra-4.0, 4.1 and 5.0

2024-08-03 Thread Yifan Cai
Hi,

I am proposing backporting CASSANDRA-19800 to Cassandra-4.0, 4.1 and 5.0.

There is a discussion thread
<https://lists.apache.org/thread/oojdsh7oy3pxszrnghxld7m0wmh4tk40> on the
topic. In summary, the backport would benefit Cassandra Analytics by
providing a unified solution, and the patch is considered low-risk. While
there are concerns about adding features to 4.0 and 4.1, there is generally
support for 5.0.

The vote will be open for 72 hours (longer if needed). Votes by PMC members
are considered binding. A vote passes if there are at least three binding
+1s and no -1's.

Kind regards,
Yifan Cai


Re: [DISCUSS] inotify for detection of manually removed snapshots

2024-08-07 Thread Yifan Cai
With WatchService, when events are missed (which is to be expected), you
will still need to list the files. It seems to me that WatchService
doesn't offer significant benefits in this case.
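
A minimal sketch of why the listing fallback is still needed, using the
standard java.nio.file.WatchService API. The class name, the method names
and the directory layout (one parent directory whose children are snapshot
directories) are assumptions made for illustration; this is not the
proposed Cassandra patch. When the service reports OVERFLOW, events may
have been lost, so the cached view has to be rebuilt from a full listing.

```java
import java.io.IOException;
import java.nio.file.*;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.TimeUnit;
import java.util.stream.Stream;

final class SnapshotWatchSketch {
    // Full listing of directory names under root: the fallback when events are lost.
    static Set<String> listSnapshots(Path root) throws IOException {
        Set<String> names = new TreeSet<>();
        try (Stream<Path> children = Files.list(root)) {
            children.filter(Files::isDirectory)
                    .forEach(p -> names.add(p.getFileName().toString()));
        }
        return names;
    }

    // Applies one batch of events to the cache; rebuilds the cache on OVERFLOW.
    static void drain(WatchKey key, Path root, Set<String> cache) throws IOException {
        for (WatchEvent<?> event : key.pollEvents()) {
            if (event.kind() == StandardWatchEventKinds.OVERFLOW) {
                cache.clear();
                cache.addAll(listSnapshots(root)); // events were dropped: list again
                break;
            }
            if (event.kind() == StandardWatchEventKinds.ENTRY_DELETE) {
                cache.remove(event.context().toString());
            }
        }
        key.reset(); // re-arm the key so later events are delivered
    }

    public static void main(String[] args) throws Exception {
        Path root = Files.createTempDirectory("snapshots");
        Files.createDirectory(root.resolve("snap-a"));
        Files.createDirectory(root.resolve("snap-b"));
        Set<String> cache = listSnapshots(root);
        try (WatchService ws = FileSystems.getDefault().newWatchService()) {
            root.register(ws, StandardWatchEventKinds.ENTRY_DELETE);
            Files.delete(root.resolve("snap-a")); // simulates a manual rm -rf
            // Delivery latency is platform dependent, so the poll may time out.
            WatchKey key = ws.poll(10, TimeUnit.SECONDS);
            if (key != null) {
                drain(key, root, cache);
            }
        }
        System.out.println(cache);
    }
}
```

The listing code in listSnapshots is needed either way, which is the point
above: the watcher only saves work in the cases where no event was lost.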

Regarding listing the directory with a refresh flag, my concern is the
potential for abuse. End-users could always refresh before listing,
which would undermine the purpose of caching. Perhaps Jeremiah can provide
more insight on this.

IMO, caching is best handled internally. I have a few UX-related questions:
- Is it valid or acceptable to return stale data? If so, end-users have to
do some form of validation before consuming each snapshot to account for
potential deletions.
- Even if listsnapshots returns the most recent data, is it possible that
some of the directories get deleted while end-users are accessing them? I
think so. That, in turn, forces end-users to do some validation first,
similar to handling stale data.

Just my 2 cents.

- Yifan

On Wed, Aug 7, 2024 at 6:03 AM Štefan Miklošovič 
wrote:

> Yes, for example as reported here
>
> https://issues.apache.org/jira/browse/CASSANDRA-13338
>
> People who are charting this in monitoring dashboards might also hit this.
>
> On Wed, Aug 7, 2024 at 2:59 PM J. D. Jordan 
> wrote:
>
>> If you have a lot of snapshots and have for example a metric monitoring
>> them and their sizes, if you don’t cache it, creating the metric can cause
>> performance degradation. We added the cache because we saw this happen to
>> databases more than once.
>>
>> > On Aug 7, 2024, at 7:54 AM, Josh McKenzie  wrote:
>> >
>> > 
>> >>
>> >> Snapshot metadata are currently stored in memory / they are cached so
>> we do not need to go to disk every single time we want to list them, the
>> more snapshots we have, the worse it is.
>> > Are we enumerating our snapshots somewhere on the hot path, or is this
>> performance concern misplaced?
>> >
>> >> On Wed, Aug 7, 2024, at 7:44 AM, Štefan Miklošovič wrote:
>> >> Snapshot metadata are currently stored in memory / they are cached so
>> we do not need to go to disk every single time we want to list them, the
>> more snapshots we have, the worse it is.
>> >>
>> >> When a snapshot is _manually_ removed from disk, not from nodetool
>> clearsnapshot, just by rm -rf on a respective snapshot directory, then such
>> snapshot will be still visible in nodetool listsnapshots. Manual removal of
>> a snapshot might be done e.g. by accident or by some "impatient" operator
>> who just goes to disk and removes it there instead of using nodetool or
>> respective JMX method.
>> >>
>> >> To improve UX here, what I came up with is that we might use Java's
>> WatchService where each snapshot dir would be registered. WatchService is
>> part of Java, it uses inotify subsystem which is what Linux kernel offers.
>> The result of doing it is that once a snapshot dir is registered to be
>> watched and when it is removed then we are notified about that via inotify
>> / WatchService so we can react on it and remove the in-memory
>> representation of that so it will not be visible in the output anymore.
>> >>
>> >> While this works, there are some questions / concerns
>> >>
>> >> 1) What do people think about inotify in general? I tested this on 10k
>> snapshots and it seems to work just fine, nevertheless there is in general
>> no strong guarantee that every single event will come through, there is
>> also a family of kernel parameters around this where more tuning can be
>> done etc. It is also questionable how this will behave on systems other
>> than Linux (Mac etc). While the JRE running on different platforms also
>> implements this, I am not completely sure these implementations are
>> quality-wise the same as for Linux etc. There is a history of
>> not-so-quality implementations for other systems (events not coming through
>> on Macs etc) and while I think we are safe on Linux, I am not sure we want
>> to go with this elsewhere.
>> >>
>> >> 2) inotify brings more entropy into the codebase, it is another thing
>> we need to take care of etc (however, it is all concentrated in one class
>> and pretty much "isolated" from everything else)
>> >>
>> >> I made this feature optional and it is turned off by default so people
>> need to explicitly opt-in into this so we are not forcing it on anybody.
>> >>
>> >> If we do not want to go with inotify, another option would be to have
>> a background thread which would periodically check if a manifest exists on
>> a disk, if it does not, then a snapshot does not either. While this works,
>> what I do not like about this is that the primary reason we moved it to
>> memory was to bypass IO as much as possible yet here we would introduce
>> another check which would go to disk, and this would be done periodically,
>> which beats the whole purpose. If an operator lists snapshots once a week
>> and there is a background check running every 10 minutes (for example),
>> then the cumulative number of IO operations might be bigger t

Re: Welcome Doug Rohrer as Cassandra Committer

2024-08-23 Thread Yifan Cai
Congrats Doug!

From: Jordan West 
Sent: Friday, August 23, 2024 1:19:04 PM
To: dev@cassandra.apache.org 
Subject: Re: Welcome Doug Rohrer as Cassandra Committer

Awesome! Congratulations Doug!

On Fri, Aug 23, 2024 at 12:17 Štefan Miklošovič 
mailto:smikloso...@apache.org>> wrote:
Great news! Congratulations, Doug.

On Fri, Aug 23, 2024 at 8:55 PM Dinesh Joshi 
mailto:djo...@apache.org>> wrote:
The Apache Cassandra PMC is thrilled to announce that Doug Rohrer has
accepted the invitation to become a committer!

Doug has worked on several aspects of Cassandra, Sidecar, and
Analytics. Congratulations and welcome!

The Apache Cassandra PMC members


Re: [VOTE] Backport CASSANDRA-19800 to Cassandra-4.0, 4.1 and 5.0

2024-08-25 Thread Yifan Cai
The vote passes with 3 binding +1s, 3 non-binding, and no vetoes.

Thanks to everyone who was part of the discussion!

- Yifan

On Wed, Aug 7, 2024 at 3:57 PM Brandon Williams  wrote:

> +1 for reasons stated in the discussion.
>
> Kind Regards,
> Brandon
>
> On Sun, Aug 4, 2024 at 1:18 AM Yifan Cai  wrote:
> >
> > Hi,
> >
> > I am proposing backporting CASSANDRA-19800 to Cassandra-4.0, 4.1 and 5.0.
> >
> > There is a discussion thread on the topic. In summary, the backport
> would benefit Cassandra Analytics by providing a unified solution, and the
> patch is considered low-risk. While there are concerns about adding
> features to 4.0 and 4.1, there is generally support for 5.0.
> >
> > The vote will be open for 72 hours (longer if needed). Votes by PMC
> members are considered binding. A vote passes if there are at least three
> binding +1s and no -1's.
> >
> > Kind regards,
> > Yifan Cai
>


Re: Welcome Jordan West and Stefan Miklosovic as Cassandra PMC members!

2024-08-30 Thread Yifan Cai
Congrats Jordan and Stefan!

From: Sumanth Pasupuleti 
Sent: Friday, August 30, 2024 1:31:01 PM
To: dev@cassandra.apache.org 
Subject: Re: Welcome Jordan West and Stefan Miklosovic as Cassandra PMC members!

Congratulations Jordan and Stefan!!!

On Fri, Aug 30, 2024 at 1:21 PM Jon Haddad 
mailto:j...@jonhaddad.com>> wrote:
The PMC's members are pleased to announce that Jordan West and Stefan 
Miklosovic have accepted invitations to become PMC members.

Thanks a lot, Jordan and Stefan, for everything you have done for the project 
all these years.

Congratulations and welcome!!

The Apache Cassandra PMC


Re: Welcome Berenguer Blasi as Cassandra committer

2021-03-26 Thread Yifan Cai


Congratulations Berenguer!

- Yifan

> On Mar 26, 2021, at 11:49 AM, Sumanth Pasupuleti 
>  wrote:
> 
> Congratulations Berenguer!

-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: [DISCUSS] Releases after 4.0

2021-03-29 Thread Yifan Cai
+1

On Mon, Mar 29, 2021 at 8:42 AM J. D. Jordan 
wrote:

> +1 that deprecation schedule seems reasonable and a good thing to move to.
>
> > On Mar 29, 2021, at 10:23 AM, Benjamin Lerer  wrote:
> >
> > The proposal sounds good to me too.
> >
> >> Le lun. 29 mars 2021 à 16:48, Brandon Williams  a
> écrit :
> >>
> >>> On Mon, Mar 29, 2021 at 9:41 AM Joseph Lynch 
> >>> wrote:
> >>> I like the idea of the 3-year support cycles, but I think since
> >>> 3.0/3.11/4.0 took so long to stabilize to a point folks could upgrade
> >>> to, we should reset the clock somewhat.
> >>
> >> I agree, the length of time to release 4.0 and the initialization of a
> >> new release cycle requires some special consideration for current
> >> releases.
> >>
> >>> 4.0: Fully supported until April 2023 and high severity bugs until
> >>> April 2024 (2 year full, 1 year bugfix)
> >>> 3.11: Fully supported until April 2022 and high severity bugs until
> >>> April 2023 (1 year full, 1 year bugfix).
> >>> 3.0: Supported for high severity correctness/performance bugs until
> >>> April 2022 (1 year bugfix)
> >>> 2.2+2.1: EOL immediately.
> >>>
> >>> Then going forward we could have this nice pattern when we cut the
> >>> yearly release:
> >>> Y(n-0): Support for 3 years from now (2 full, 1 bugfix)
> >>> Y(n-1): Fully supported for 1 more year and supported for high
> >>> severity correctness/perf bugs 1 year after that (1 full, 1 bugfix)
> >>> Y(n-2): Supported for high severity correctness/bugs for 1 more year (1
> >> bugfix)
> >>
> >> This sounds excellent to me, +1.
> >>
> >>
> >>
>
>
>


Re: [VOTE] Release Apache Cassandra 4.0-rc1

2021-03-29 Thread Yifan Cai
+1 nb

On Mon, Mar 29, 2021 at 9:33 AM Aleksey Yeshchenko
 wrote:

> +1
>
> > On 29 Mar 2021, at 14:05, Mick Semb Wever  wrote:
> >
> > Proposing the test build of Cassandra 4.0-rc1 for release.
> >
> > sha1: 2facbc97ea215faef1735d9a3d5697162f61bc8c
> > Git:
> >
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/4.0-rc1-tentative
> > Maven Artifacts:
> >
> https://repository.apache.org/content/repositories/orgapachecassandra-1234/org/apache/cassandra/cassandra-all/4.0-rc1/
> >
> > The Source and Build Artifacts, and the Debian and RPM packages and
> > repositories, are available here:
> > https://dist.apache.org/repos/dist/dev/cassandra/4.0-rc1/
> >
> > The vote will be open for 72 hours (longer if needed). Everyone who has
> > tested the build is invited to vote. Votes by PMC members are considered
> > binding. A vote passes if there are at least three binding +1s and no
> -1's.
> >
> > Known issues with this release, that are planned to be fixed in 4.0-rc2,
> are
> > - four files were missing copyright headers,
> > - LICENSE and NOTICE contain additional unneeded information,
> > - jar files under lib/ in the source artefact.
> >
> > These issues are actively being worked on, along with our expectations
> that
> > the ASF makes the policy around them more explicit so it is clear exactly
> > what is required of us.
> >
> >
> > [1]: CHANGES.txt:
> >
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=refs/tags/4.0-rc1-tentative
> > [2]: NEWS.txt:
> >
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=NEWS.txt;hb=refs/tags/4.0-rc1-tentative
>
>
>
>


Re: [DISCUSS] Remove support for `test.runners` and `testparallel`

2021-04-13 Thread Yifan Cai
+1 to removing ant test parallelism and leveraging containers for it.

- Yifan

> On Apr 13, 2021, at 4:00 AM, Angelo Polo  wrote:
> 
> Docker doesn't run natively on FreeBSD (though work is underway to enable
> that). It's possible to run Docker Machine inside VirtualBox so maybe
> that's workable, otherwise I suppose I can live without parallel testing
> for now since I'm probably the only one.
> 
> Best,
> Angelo
> 
> On Tue, Apr 13, 2021 at 10:59 AM Mick Semb Wever  wrote:
> 
>>> +1 after chatting with Mick who clarified the picture for me. Thx Mick.
>> 
>> 👍
>> 
>> I'm +1 as well to removing test.runner and testparallel support, from
>> all branches.
>> 
>> CASSANDRA-16595 has been created.
>> 
>> 
>> 




Re: [VOTE] Release Apache Cassandra 4.0-rc1 (take2)

2021-04-21 Thread Yifan Cai
+1

On Wed, Apr 21, 2021 at 10:33 PM Berenguer Blasi 
wrote:

> +1
>
> On 22/4/21 5:12, Blake Eggleston wrote:
> > +1
> >
> >> On Apr 21, 2021, at 2:25 PM, Scott Andreas 
> wrote:
> >>
> >> +1nb, thank you!
> >>
> >> 
> >> From: Ekaterina Dimitrova 
> >> Sent: Wednesday, April 21, 2021 12:23 PM
> >> To: dev@cassandra.apache.org
> >> Subject: Re: [VOTE] Release Apache Cassandra 4.0-rc1 (take2)
> >>
> >> +1 and thanks everyone for all the hard work
> >>
> >> Checked:
> >> - gpg signatures
> >> - sha checksums
> >> - binary convenience artifact runs
> >> - src convenience artifacts builds with one command, and runs
> >> - deb and rpm install and run
> >>
> >>> On Wed, 21 Apr 2021 at 14:57, Michael Semb Wever 
> wrote:
> >>>
> >>>
>  The vote will be open for 72 hours (longer if needed). Everyone who
>  has tested the build is invited to vote. Votes by PMC members are
>  considered binding. A vote passes if there are at least three binding
>  +1s and no -1's.
> >>>
> >>> +1
> >>>
> >>>
> >>>
> >>>
> >>
> >
>
>
>


Re: Welcome Stefan Miklosovic as Cassandra committer

2021-05-03 Thread Yifan Cai
Congrats!

On Mon, May 3, 2021 at 1:23 PM Paulo Motta  wrote:

> Congrats, Stefan! Happy to see you onboard! :)
>
> Em seg., 3 de mai. de 2021 às 17:17, Ben Bromhead 
> escreveu:
>
> > Congrats mate!
> >
> > On Tue, May 4, 2021 at 4:20 AM Scott Andreas 
> wrote:
> >
> > > Congratulations, Štefan!
> > >
> > > 
> > > From: David Capwell 
> > > Sent: Monday, May 3, 2021 10:53 AM
> > > To: dev@cassandra.apache.org
> > > Subject: Re: Welcome Stefan Miklosovic as Cassandra committer
> > >
> > > Congrats!
> > >
> > > > On May 3, 2021, at 9:47 AM, Ekaterina Dimitrova <
> e.dimitr...@gmail.com
> > >
> > > wrote:
> > > >
> > > > Congrat Stefan! Well done!!
> > > >
> > > > On Mon, 3 May 2021 at 11:49, J. D. Jordan  >
> > > wrote:
> > > >
> > > >> Well deserved!  Congrats Stefan.
> > > >>
> > > >>> On May 3, 2021, at 10:46 AM, Sumanth Pasupuleti <
> > > >> sumanth.pasupuleti...@gmail.com> wrote:
> > > >>>
> > > >>> Congratulations Stefan!!
> > > >>>
> > >  On Mon, May 3, 2021 at 8:41 AM Brandon Williams  >
> > > >> wrote:
> > > 
> > >  Congratulations, Stefan!
> > > 
> > > > On Mon, May 3, 2021 at 10:38 AM Benjamin Lerer <
> b.le...@gmail.com>
> > > >> wrote:
> > > >
> > > > The PMC's members are pleased to announce that Stefan Miklosovic
> > has
> > > > accepted the invitation to become committer last Wednesday.
> > > >
> > > > Thanks a lot, Stefan,  for all your contributions!
> > > >
> > > > Congratulations and welcome
> > > >
> > > > The Apache Cassandra PMC members
> > > 
> > > 
> > > 
> > > 
> > > >>
> > > >>
> > > >>
> > > >>
> > >
> > >
> > >
> > > --
> >
> > Ben Bromhead
> >
> > Instaclustr | www.instaclustr.com | @instaclustr
> >  | +64 27 383 8975
> >
>


Re: Welcome Caleb Rackliffe as Cassandra committer

2021-05-14 Thread Yifan Cai
Congrats Caleb!

> On May 14, 2021, at 6:56 AM, Joshua McKenzie  wrote:
> 
> Congrats Caleb!
> 
>> On Fri, May 14, 2021 at 9:10 AM Brandon Williams  wrote:
>> 
>> Congrats Caleb! Well deserved.
>> 
>>> On Fri, May 14, 2021, 8:03 AM Mick Semb Wever  wrote:
>>> 
>>> The PMC members are pleased to announce that Caleb Rackliffe has
>>> accepted the invitation to become committer.
>>> 
>>> Thanks heaps Caleb for helping make Cassandra awesome!
>>> 
>>> Congratulations and welcome,
>>> The Apache Cassandra PMC members
>>> 
>> 




[DISCUSS] CASSANDRA-16760 - JMXTimer exposes attributes in inconsistent time units

2021-06-24 Thread Yifan Cai
Hi,

In the current codebase, JMXTimer exposes its attributes in inconsistent
time units. The percentiles, Mean and DurationUnit attributes use
micros, but Values and RecentValues are based on nanos, since the
underlying Timer collects the time values in nanos.

The inconsistency leads to confusion and misinterpretation of the values
if the end user is not familiar with the implementation details. One may
assume that Values and RecentValues are also in micros, as stated by
DurationUnit.
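
As an illustration of the pitfall, here is a minimal sketch of the
conversion a consumer currently has to apply by hand before a nanos-based
figure (for example a histogram bucket boundary) can be compared with a
percentile reported in the DurationUnit. The class name, the method name
and the sample numbers are hypothetical; only java.util.concurrent.TimeUnit
is a real API here.

```java
import java.util.concurrent.TimeUnit;

final class TimerUnitSketch {
    // Converts a figure expressed in nanos to the unit named by DurationUnit.
    static double fromNanos(long nanos, TimeUnit durationUnit) {
        return (double) nanos / durationUnit.toNanos(1);
    }

    public static void main(String[] args) {
        long boundaryNanos = 2_500L; // hypothetical nanos-based figure
        double p99Micros = 2.5;      // hypothetical percentile, already in micros
        // Comparing the two directly would be off by a factor of 1000.
        double boundaryMicros = fromNanos(boundaryNanos, TimeUnit.MICROSECONDS);
        System.out.println(boundaryMicros <= p99Micros);
    }
}
```

A reader who trusts DurationUnit for every attribute would skip this
conversion and misread the nanos-based figures by three orders of magnitude.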

Besides the confusion, given that the intention is to record the time values
at micros resolution, we do not need to allocate 165 buckets in the
DecayingEstimatedHistogramReservoir. 165 buckets are necessary for nanos,
but not for micros. We can allocate only 90 buckets, which should reduce
the memory footprint of the Timers by ~50%.

It is a relatively small change to unify the exposed values in the same
unit. But since it changes the exposed metrics API, I'd like to start this
discussion thread to gather your opinions and, hopefully, avoid breaking
your tooling.

There are several options (all for timers specifically):

   1. Enforce the consistency of the time unit.
      - Change all JMXTimers to store values in micros and reduce the
        bucket size to 90. The change has no impact on reading the
        statistics. But the long[] of Values and RecentValues is reduced
        to 91, and the values are based on micros.
      - Change all JMXTimers to store values in nanos. The change makes the
        percentiles and mean values be returned in nanos, but has no impact
        on the histogram raw values, i.e., Values and RecentValues.
   2. Have a toggle to either keep the current inconsistency or record
      everything in micros. This is less invasive than option 1, and it does
      not affect your monitoring tooling if it reads the Values (histogram
      raw values) at nanos resolution.

I'd prefer option 1, so that the DurationUnit attribute correctly describes
the other attributes of the JMXTimer. For most of the timers, we do not need
nanos resolution, and recording them in micros halves the memory footprint
of the timers. If some timers do need nanos resolution, their duration unit
can be changed to nanos. The external process that reads the attributes can
then correctly interpret the values based on the duration unit.

Thoughts?

- Yifan


Re: [DISCUSS] CASSANDRA-16760 - JMXTimer exposes attributes in inconsistent time units

2021-06-25 Thread Yifan Cai
>
> how much memory the Timers can currently use


Timer is currently backed by a DecayingEstimatedHistogramReservoir. [1]

Each DecayingEstimatedHistogramReservoir allocates by default [2]
1. *bucketOffsets*: a long array with the length of 164
2. *decayingBuckets*: a long array with the length of 165 * 2
3. *buckets*: a long array with the length of 165 * 2

Each timer instance consumes roughly 6592 bytes (counting only the long
arrays, which are the main contributors).
There are a bunch of timers: per verb, per keyspace, per table, etc.,
although adding them up might still not be a concern.
As mentioned, recording in micros can halve the memory usage. Not a
significant saving compared with other components, but still good to have
if nanos resolution is not necessary.
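
The 6592-byte figure follows directly from the array lengths listed above:
(164 + 165 * 2 + 165 * 2) longs at 8 bytes each. Below is a minimal sketch
of that arithmetic. For the micros case it assumes 90 bucket offsets and 91
buckets per stripe, as mentioned earlier in the thread, and the same factor
of 2; the class and method names are made up, and array headers and other
fields are ignored.

```java
final class TimerFootprintSketch {
    // Bytes held by the three long[] arrays for a given number of bucket offsets.
    static long longArrayBytes(int bucketOffsets, int stripes) {
        int bucketsPerStripe = bucketOffsets + 1;        // one more bucket than offsets
        long longs = bucketOffsets                       // bucketOffsets
                   + (long) bucketsPerStripe * stripes   // decayingBuckets
                   + (long) bucketsPerStripe * stripes;  // buckets
        return longs * Long.BYTES;
    }

    public static void main(String[] args) {
        long nanosLayout = longArrayBytes(164, 2);  // current default layout
        long microsLayout = longArrayBytes(90, 2);  // assumed micros layout
        System.out.println(nanosLayout + " bytes vs " + microsLayout + " bytes");
    }
}
```

Under these assumptions the micros layout comes to 3632 bytes per timer,
about 55% of the current 6592 bytes, which is where "roughly half" comes
from.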

The major benefit is making the duration unit consistent.

[1]
https://github.com/apache/cassandra/blob/aac6f7db8c8f493b8e28842903e6e2cb6838ac75/src/java/org/apache/cassandra/metrics/CassandraMetricsRegistry.java#L101
[2]
https://github.com/apache/cassandra/blob/aac6f7db8c8f493b8e28842903e6e2cb6838ac75/src/java/org/apache/cassandra/metrics/DecayingEstimatedHistogramReservoir.java#L79

On Fri, Jun 25, 2021 at 7:26 AM Joshua McKenzie 
wrote:

> +1 to unifying on the same unit for API consistency; micros should be quite
> fine for most if not all of our use-cases.
>
>
> On Fri, Jun 25, 2021 at 8:58 AM Brandon Williams  wrote:
>
> > On Fri, Jun 25, 2021 at 6:17 AM Mick Semb Wever  wrote:
> > >
> > > I'm for (1) if this is for 4.1 only. Changes like this over our annual
> > releases should be fine if they are clearly documented, it's what
> NEWS.txt
> > is for.
> >
> > +1, we have the process in place to handle this.
> >
> >
> >
>


Re: [VOTE] Release Apache Cassandra 4.0-rc2

2021-06-28 Thread Yifan Cai
+1


- Yifan

> On Jun 28, 2021, at 8:40 AM, Ekaterina Dimitrova  
> wrote:
> 
> +1 Thanks everyone!
> 
>> On Mon, 28 Jun 2021 at 11:39, Aleksey Yeschenko  wrote:
>> 
>> +1
>> 
 On 28 Jun 2021, at 14:05, Gary Dusbabek  wrote:
>>> 
>>> +1; yay!
>>> 
>>>> On Sun, Jun 27, 2021 at 11:02 AM Mick Semb Wever  wrote:
>>>>
>>>> Proposing the test build of Cassandra 4.0-rc2 for release.
>>>>
>>>> sha1: 4c98576533e1d7663baf447e8877788096489165
>>>> Git:
>>>> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/4.0-rc2-tentative
>>>> Maven Artifacts:
>>>> https://repository.apache.org/content/repositories/orgapachecassandra-1237/org/apache/cassandra/cassandra-all/4.0-rc2/
>>>>
>>>> The Source and Build Artifacts, and the Debian and RPM packages and
>>>> repositories, are available here:
>>>> https://dist.apache.org/repos/dist/dev/cassandra/4.0-rc2/
>>>>
>>>> The vote will be open for 72 hours (longer if needed). Everyone who has
>>>> tested the build is invited to vote. Votes by PMC members are considered
>>>> binding. A vote passes if there are at least three binding +1s and no
>>>> -1's.
>>>>
>>>> [1]: CHANGES.txt:
>>>> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=refs/tags/4.0-rc2-tentative
>>>> [2]: NEWS.txt:
>>>> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=NEWS.txt;hb=refs/tags/4.0-rc2-tentative
>>>> [3]: The maven artifacts were accidentally prematurely made public. Docs
>>>> have been updated to prevent this happening again.
>>>>
>> 
>> 
>> 
>> 



Re: [DISCUSS] CASSANDRA-16760 - JMXTimer exposes attributes in inconsistent time units

2021-06-30 Thread Yifan Cai
Looks like we all agree on option 1. I have submitted a patch to the trunk
branch. It unifies the duration unit and defaults to micros. As a result,
all timers will start to record time values in micros instead of nanos.

Please let me know if there is any concern with the change.

- Yifan

On Fri, Jun 25, 2021 at 1:57 PM Yifan Cai  wrote:

> how much memory the Timers can currently use
>
>
> Timer is currently backed by a DecayingEstimatedHistogramReservoir. [1]
>
> Each DecayingEstimatedHistogramReservoir defaults to allocate [2]
> 1. *bucketOffsets*: a long array with the length of 164
> 2. *decayingBuckets*: a long array with the length of 165 * 2
> 3. *buckets*: a long array with the length of 165 * 2
>
> Each timer instance consumes 6592 bytes roughly. (Only counting the long
> arrays, which are the main contributors)
> There are a bunch of timers, per verb, per keyspace, per table, etc.
> Although adding them up might still not be a concern.
> As mentioned, recording in the micros can halve the memory usage. Not a
> significant saving compared with other components, but still good to have
> if nanos is not necessary.
>
> The major benefit is making the duration unit consistent.
>
> [1]
> https://github.com/apache/cassandra/blob/aac6f7db8c8f493b8e28842903e6e2cb6838ac75/src/java/org/apache/cassandra/metrics/CassandraMetricsRegistry.java#L101
> [2]
> https://github.com/apache/cassandra/blob/aac6f7db8c8f493b8e28842903e6e2cb6838ac75/src/java/org/apache/cassandra/metrics/DecayingEstimatedHistogramReservoir.java#L79
>
> On Fri, Jun 25, 2021 at 7:26 AM Joshua McKenzie 
> wrote:
>
>> +1 to unifying on the same unit for API consistency; micros should be
>> quite
>> fine for most if not all of our use-cases.
>>
>>
>> On Fri, Jun 25, 2021 at 8:58 AM Brandon Williams 
>> wrote:
>>
>> > On Fri, Jun 25, 2021 at 6:17 AM Mick Semb Wever  wrote:
>> > >
>> > > I'm for (1) if this is for 4.1 only. Changes like this over our annual
>> > releases should be fine if they are clearly documented, it's what
>> NEWS.txt
>> > is for.
>> >
>> > +1, we have the process in place to handle this.
>> >
>> >
>> >
>>
>
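
For reference, the per-timer figure in the quoted message can be reproduced with simple arithmetic. The following is a minimal sketch, assuming only the array lengths stated there (164, 165 * 2 and 165 * 2) and 8 bytes per long. The class and method names are illustrative and are not part of Cassandra, and the count of 10,000 timers is a made-up figure used only to show the scaling.

```java
// Back-of-the-envelope check of the per-timer estimate quoted above.
// Nothing here calls Cassandra code; the array lengths are copied from the message.
final class TimerMemoryEstimate
{
    static final int BUCKET_OFFSETS_LENGTH = 164;       // bucketOffsets
    static final int DECAYING_BUCKETS_LENGTH = 165 * 2; // decayingBuckets
    static final int BUCKETS_LENGTH = 165 * 2;          // buckets

    // Payload bytes of the three long arrays; object and array headers are ignored.
    static long bytesPerTimer()
    {
        long totalLongs = BUCKET_OFFSETS_LENGTH + DECAYING_BUCKETS_LENGTH + BUCKETS_LENGTH;
        return totalLongs * Long.BYTES;
    }

    // Total for a hypothetical number of timer instances.
    static long bytesForTimers(long timerCount)
    {
        return timerCount * bytesPerTimer();
    }

    public static void main(String[] args)
    {
        System.out.println(bytesPerTimer());        // 6592
        System.out.println(bytesForTimers(10_000)); // 65920000
    }
}
```

The sum is 164 + 330 + 330 = 824 longs, which at 8 bytes each gives the 6592 bytes mentioned in the thread.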


Re: [VOTE] Release Apache Cassandra 4.0.0 (third time is the charm)

2021-07-23 Thread Yifan Cai
+1 (nb)

On Fri, Jul 23, 2021 at 10:32 AM Jon Meredith  wrote:

> +1 (nb)
>
> On Fri, Jul 23, 2021 at 10:22 AM Jake Luciani  wrote:
>
> > +1
> >
> > On Fri, Jul 23, 2021 at 11:31 AM Blake Eggleston
> >  wrote:
> >
> > > +1
> > >
> > > > On Jul 23, 2021, at 6:39 AM, Branimir Lambov <
> > > branimir.lam...@datastax.com> wrote:
> > > >
> > > > +1
> > > >
> > > >> On Fri, Jul 23, 2021 at 4:15 PM Aleksey Yeschenko <
> alek...@apache.org
> > >
> > > >> wrote:
> > > >>
> > > >> +1
> > > >>
> > >  On 23 Jul 2021, at 14:03, Joshua McKenzie 
> > > wrote:
> > > >>>
> > > >>> +1
> > > >>>
> > > >>> On Fri, Jul 23, 2021 at 8:07 AM Dinesh Joshi
> > >  > > >>>
> > > >>> wrote:
> > > >>>
> > >  +1
> > > 
> > > 
> > > > On Jul 23, 2021, at 4:56 AM, Paulo Motta <
> pauloricard...@gmail.com
> > >
> > >  wrote:
> > > >
> > > > +1
> > > >
> > > >> Em sex., 23 de jul. de 2021 às 08:37, Andrés de la Peña <
> > > >> a.penya.gar...@gmail.com> escreveu:
> > > >>
> > > >> +1
> > > >>
> > > >>> On Fri, 23 Jul 2021 at 11:56, Sam Tunnicliffe 
> > > >> wrote:
> > > >>>
> > > >>> +1
> > > >>>
> > >  On 22 Jul 2021, at 23:40, Brandon Williams <
> > >  brandonwilli...@apache.org
> > > >>>
> > > >>> wrote:
> > > 
> > >  I am proposing the test build of Cassandra 4.0.0 for release.
> > > 
> > >  sha1: 902b4d31772eaa84f05ffdc1e4f4b7a66d5b17e6
> > >  Git:
> > > >>>
> > > >>
> > > 
> > > >>
> > >
> >
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/4.0.0-tentative
> > >  Maven Artifacts:
> > > 
> > > >>>
> > > >>
> > > 
> > > >>
> > >
> >
> https://repository.apache.org/content/repositories/orgapachecassandra-1244/org/apache/cassandra/cassandra-all/4.0.0/
> > > 
> > >  The Source and Build Artifacts, and Debian and RPM packages
> and
> > >  repositories are available here:
> > >  https://dist.apache.org/repos/dist/dev/cassandra/4.0.0/
> > > 
> > >  The vote will be open for 72 hours (longer if needed).
> Everyone
> > > who
> > >  has tested the build is invited to vote. Votes by PMC members
> > are
> > >  considered binding. A vote passes if there are at least three
> > > >> binding
> > >  +1s and no -1's.
> > > 
> > >  [1]: CHANGES.txt:
> > > 
> > > >>>
> > > >>
> > > 
> > > >>
> > >
> >
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=refs/tags/4.0.0-tentative
> > >  [2]: NEWS.txt:
> > > >>>
> > > >>
> > > 
> > > >>
> > >
> >
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=NEWS.txt;hb=refs/tags/4.0.0-tentative
> > > 
> > > 
> > > >>
> > > 
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > >>
> > > 
> > > 
> > > 
> > > 
> > > >>
> > > >>
> > > >>
> > > >>
> > > >>
> > > >
> > > > --
> > > > Branimir Lambov
> > > > e. branimir.lam...@datastax.com
> > > > w. www.datastax.com
> > >
> > >
> > >
> >
> > --
> > http://twitter.com/tjake
> >
>


Re: Welcome Jon Meredith as Cassandra committer

2021-07-30 Thread Yifan Cai
Congrats Jon!

On Fri, Jul 30, 2021 at 8:48 AM Joshua McKenzie 
wrote:

> Congratulations Jon!
>
>
> On Fri, Jul 30, 2021 at 10:35 AM Andrés de la Peña <
> a.penya.gar...@gmail.com>
> wrote:
>
> > Congratulations, Jon!
> >
> > On Fri, 30 Jul 2021 at 16:07, J. D. Jordan 
> > wrote:
> >
> > > Congrats Jon!
> > >
> > > > On Jul 30, 2021, at 9:26 AM, Paulo Motta 
> > > wrote:
> > > >
> > > > Congratulations and welcome Jon! Always exciting to see the project
> > > > recognizing more committers!
> > > >
> > > >> Em sex., 30 de jul. de 2021 às 11:20, Benjamin Lerer <
> > b.le...@gmail.com
> > > >
> > > >> escreveu:
> > > >>
> > > >> Congratulations Jon. :-)
> > > >>
> > > >> Le ven. 30 juil. 2021 à 15:42, Ekaterina Dimitrova <
> > > e.dimitr...@gmail.com>
> > > >> a écrit :
> > > >>
> > > >>> Congrats!!! Well deserved!!! 🎉 👏🏻
> > > >>>
> > >  On Fri, 30 Jul 2021 at 9:32, Jonathan Ellis 
> > > wrote:
> > > >>>
> > >  Congratulations, Jon!
> > > 
> > >  On Fri, Jul 30, 2021 at 8:29 AM Brandon Williams <
> dri...@gmail.com>
> > > >>> wrote:
> > > 
> > > > The Project Management Committee (PMC) for Apache Cassandra
> > > > has invited Jon Meredith to become a committer and we are pleased
> > > > to announce that he has accepted.
> > > >
> > > > Thanks for all helping make Cassandra great!
> > > >
> > > > Congratulations,
> > > > The Apache Cassandra PMC members
> > > >
> > > >
> > > >
> > > >
> > > 
> > >  --
> > >  Jonathan Ellis
> > >  co-founder, http://www.datastax.com
> > >  @spyced
> > > 
> > > >>>
> > > >>
> > >
> > >
> > >
> >
>


Re: [DISCUSS] Repair Improvement Proposal

2021-08-26 Thread Yifan Cai
>
> 2. Add retries to specific stages of coordination, such as prepare and
> validate. In order to do these retries we first need to know what the
> state is for the participant which has yet to reply...


If I understand it correctly, this means retries happen only in the
coordinator, and the coordinator polls the state of the participants
periodically. Is that right?
If the handling of the requests in the participant is made idempotent
(which I think is required for retries anyway), polling the state is
unnecessary. For example, the coordinator can just resend the PrepareRequest
at regular intervals until it receives the PrepareResponse.
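
The idea above can be sketched roughly as follows. This is a simplified, single-threaded illustration and not Cassandra's repair code: the `Participant` class, the `sendUntilAcked` method and the two predicates that simulate message loss are all hypothetical names introduced for the example.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;
import java.util.function.IntPredicate;

// Sketch: the coordinator resends a prepare request at a fixed interval until it
// sees a response; the participant treats duplicate requests as no-ops.
final class IdempotentPrepareSketch
{
    // Hypothetical participant whose prepare handler is idempotent per session id.
    static final class Participant
    {
        private final Set<UUID> preparedSessions = new HashSet<>();
        int prepareCount = 0; // number of times the real prepare work was performed

        boolean handlePrepare(UUID sessionId)
        {
            if (preparedSessions.add(sessionId))
                prepareCount++; // first request for this session: do the work
            return true;        // always acknowledge, including for duplicates
        }
    }

    // Returns the attempt on which the response was received, or -1 if none arrived.
    // The predicates simulate whether the request or response survives the network.
    static int sendUntilAcked(Participant participant,
                              UUID sessionId,
                              IntPredicate requestDelivered,
                              IntPredicate responseDelivered,
                              int maxAttempts,
                              long retryIntervalMillis)
    {
        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            if (requestDelivered.test(attempt))
            {
                boolean acked = participant.handlePrepare(sessionId);
                if (acked && responseDelivered.test(attempt))
                    return attempt; // the coordinator saw the response
            }
            try
            {
                Thread.sleep(retryIntervalMillis); // wait before resending
            }
            catch (InterruptedException e)
            {
                Thread.currentThread().interrupt();
                return -1;
            }
        }
        return -1;
    }

    public static void main(String[] args)
    {
        Participant participant = new Participant();
        // The request is lost on attempt 1 and the response is lost on attempt 2.
        int attempts = sendUntilAcked(participant, UUID.randomUUID(),
                                      attempt -> attempt != 1,
                                      attempt -> attempt != 2,
                                      5, 10L);
        System.out.println(attempts);                 // 3
        System.out.println(participant.prepareCount); // 1
    }
}
```

Because the handler ignores a session id it has already seen, the lost response on the second attempt costs only one more message; the coordinator never needs to query the participant's state.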

- Yifan

On Thu, Aug 26, 2021 at 8:56 AM Blake Eggleston
 wrote:

> +1 from me, any improvement in this area would be great.
>
> It would be nice if this could include visibility into repair streams, but
> just exposing the repair state will be a big improvement.
>
> > On Aug 25, 2021, at 5:46 PM, David Capwell  wrote:
> >
> > Now that 4.0 is out, I want to bring up improving repair again (earlier
> > thread
> >
> http://mail-archives.apache.org/mod_mbox/cassandra-commits/201911.mbox/%3cjira.13266448.1572997299000.99567.1572997440...@atlassian.jira%3E
> ),
> > specifically the following two JIRAs:
> >
> >
> > CASSANDRA-15566 - Repair coordinator can hang under some cases
> >
> > CASSANDRA-15399 - Add ability to track state in repair
> >
> >
> > Right now repair has an issue if any message is lost, which leads to hung
> > or timed out repairs; in addition there is a large lack of visibility
> into
> > what is going on, and can be even harder if you wish to join coordinator
> > with participant state.
> >
> >
> > I propose the following changes to improve our current repair subsystem:
> >
> >
> >
> >   1. New tracking system for coordinator and participants (covered by
> >   CASSANDRA-15399).  This system will expose progress on each instance
> and
> >   expose this information for internal access as well as external users
> >   2. Add retries to specific stages of coordination, such as prepare and
> >   validate.  In order to do these retries we first need to know what the
> >   state is for the participant which has yet to reply, this will leverage
> >   CASSANDRA-15399 to see what's going on (has the prepare been seen?  Is
> >   validation running? Did it complete?).  In addition to checking the
> >   state, we will need to store the validation MerkleTree, this allows for
> >   coordinator to fetch if goes missing (can be dropped in route to
> >   coordinator or even on the coordinator).
> >
> >
> > What is not in scope?
> >
> >   - Rewriting all of Repair; the idea is specific "small" changes can fix
> >   80% of the issues
> >   - Handle coordinator node failure.  Being able to recover from a failed
> >   coordinator should be possible after the above work is done, so is
> seen as
> >   tangential and can be done later
> >   - Recovery from a downed participant.  Similar to the previous bullet,
> >   with the state being tracked this acts as a kind of checkpoint, so
> future
> >   work can come in to handle recovery
> >   - Handling "too large" range. Ideally we should add an ability to split
> >   the coordination into sub repairs, but this is not the goal of this
> work.
> >   - Overstreaming.  This is a byproduct of the previous "not in scope"
> >   bullet, and/or large partitions; so is tangential to this work
> >
> >
> > Wanted to share here before starting this work again; let me know if
> there
> > are any concerns or feedback!
>
>
>
>


Re: [VOTE] Release Apache Cassandra 4.0.1

2021-09-01 Thread Yifan Cai
+1


- Yifan

> On Sep 1, 2021, at 7:57 AM, C. Scott Andreas  wrote:
> 
> +1nb
> 
>> On Sep 1, 2021, at 6:54 AM, Jeff Jirsa  wrote:
>> 
>> +1
>> 
>> 
>>>> On Wed, Sep 1, 2021 at 4:54 AM Sam Tunnicliffe  wrote:
>>> 
>>> Proposing the test build of Cassandra 4.0.1 for release.
>>> 
>>> sha1: 6709111ed007a54b3e42884853f89cabd38e4316
>>> Git:
>>> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/4.0.1-tentative
>>> Maven Artifacts:
>>> https://repository.apache.org/content/repositories/orgapachecassandra-1247/org/apache/cassandra/cassandra-all/4.0.1/
>>> 
>>> The Source and Build Artifacts, and the Debian and RPM packages and
>>> repositories, are available here:
>>> https://dist.apache.org/repos/dist/dev/cassandra/4.0.1/
>>> 
>>> The vote will be open for 72 hours (longer if needed). Everyone who has
>>> tested the build is invited to vote. Votes by PMC members are considered
>>> binding. A vote passes if there are at least three binding +1s and no -1's.
>>> 
>>> [1]: CHANGES.txt:
>>> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=refs/tags/4.0.1-tentative
>>> [2]: NEWS.txt:
>>> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=NEWS.txt;hb=refs/tags/4.0.1-tentative
>>> 
>>> 
>>> 
>>> 
> 
> 




Re: [VOTE] Release dtest-api 0.0.9

2021-09-09 Thread Yifan Cai
+1


- Yifan

> On Sep 9, 2021, at 3:04 PM, Paulo Motta  wrote:
> 
> +1
> 
>> On Thu, 9 Sep 2021 at 18:38 Ekaterina Dimitrova 
>> wrote:
>> 
>> +1
>> 
>> On Thu, 9 Sep 2021 at 17:34, C. Scott Andreas 
>> wrote:
>> 
>>> +1nb
>>> 
 
 On Sep 9, 2021, at 2:03 PM, David Capwell 
>>> wrote:
 
 +1
 
> On Sep 9, 2021, at 1:52 PM, Mick Semb Wever  wrote:
> 
> +1
> 
>> On Thu, 2 Sept 2021 at 13:20, Mick Semb Wever 
>> wrote:
>> 
>> Proposing the test build of in-jvm dtest API 0.0.9 for release.
>> 
>> Repository:
>>> 
>> https://gitbox.apache.org/repos/asf?p=cassandra-in-jvm-dtest-api.git;a=shortlog;h=refs/tags/0.0.9
>> 
>> Candidate SHA:
>>> 
>> https://github.com/apache/cassandra-in-jvm-dtest-api/commit/aa25319c3e0f506d19383db31d2974a7f5c58ab8
>> tagged with 0.0.9
>> 
>> Artifacts:
>>> 
>> https://repository.apache.org/content/repositories/orgapachecassandra-1248/org/apache/cassandra/dtest-api/0.0.9/
>> 
>> Key signature: A4C465FEA0C552561A392A61E91335D77E3E87CB
>> 
>> 
>> Changes since last release:
>> * CASSANDRA-16803
>> jvm-dtest-upgrade failing
>> MixedModeReadTest.mixedModeReadColumnSubsetDigestCheck,
>> ClassNotFoundException: com.vdurmont.semver4j.Semver
>> 
>> 
>> The vote will be open for 24 hours. Everyone who has tested the build
>> is invited to vote. Votes by PMC members are considered binding. A
>> vote passes if there are at least three binding +1s.
> 
> 
 
 
 
>>> 
>>> 
>>> 
>> 




Re: Welcome Sumanth Pasupuleti as Apache Cassandra Committer

2021-11-05 Thread Yifan Cai
Congratulations Sumanth!

- Yifan

> On Nov 5, 2021, at 11:37 AM, Patrick McFadin  wrote:
> 
> Great to see this. Congrats Sumanth!
> 
>> On Fri, Nov 5, 2021 at 11:34 AM Brandon Williams  wrote:
>> 
>> Congratulations Sumanth!
>> 
>>> On Fri, Nov 5, 2021 at 1:17 PM Oleksandr Petrov
>>>  wrote:
>>> 
>>> The PMC members are pleased to announce that Sumanth Pasupuleti has
>>> recently accepted the invitation to become committer.
>>> 
>>> Sumanth, thank you for all your contributions to the project over the
>> years.
>>> 
>>> Congratulations and welcome!
>>> 
>>> The Apache Cassandra PMC members
>> 
>> 
>> 




Re: [VOTE] Formalizing our CI process

2022-01-10 Thread Yifan Cai
Would you like to elaborate on when to run the "canonical set of tests" and
when to run the others?

If my understanding is correct, we run the canonical set *before* merging,
and the runs triggered by the cassandra CI bot include the full set *after*
a commit is merged.

- Yifan

On Mon, Jan 10, 2022 at 2:37 PM Jeremiah D Jordan 
wrote:

> +1 nb
>
> On Jan 10, 2022, at 1:00 PM, Joshua McKenzie  wrote:
>
> Wiki draft article here:
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=199530280
>
> The vote will be open for 72 hours (it's short + early indication on
> discussion was consensus).
> Committer  / pmc votes binding.
> Simple majority passes.
>
> References:
> Background: original ML thread here:
> https://lists.apache.org/thread/bq470ml17g106pwxpvwgws2stxc6d7b9
> Project governance guidelines here:
> https://cwiki.apache.org/confluence/display/CASSANDRA/Cassandra+Project+Governance
>
> ~Josh
>
>
>


Re: [VOTE] Formalizing our CI process

2022-01-11 Thread Yifan Cai
+1

On Tue, Jan 11, 2022 at 7:02 AM Andrés de la Peña 
wrote:

> +1
>
> On Tue, 11 Jan 2022 at 13:45, Joshua McKenzie 
> wrote:
>
>> If my understanding is correct, we run the canonical set *before* merging,
>>> and the runs triggered by the cassandra CI bot include the full set
>>> *after* a commit is merged.
>>
>> Good point. Clarified to indicate it's canonical *circleci tests* to run
>> before merging, and the ci-cassandra jenkins is canonical post excepting
>> release blocking.
>>
>> small nits
>>
>> Tweaked those two little bits as those are non-controversial.
>>
>> All three edits are clarifications and not changes so no change to the
>> vote.
>>
>> Keep the feedback coming!
>>
>> ~Josh
>>
>> On Tue, Jan 11, 2022 at 8:27 AM Mick Semb Wever  wrote:
>>
>>>
>>>
>>> On Mon, 10 Jan 2022 at 20:00, Joshua McKenzie 
>>> wrote:
>>>
>>>> Wiki draft article here:
>>>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=199530280

>>>
>>>
>>> +1
>>>
>>> small nits:
>>> - none of the other confluence pages use the "Apache" prefix
>>> - can it reference the CI Systems page please
>>> - post-vote: maybe the release process doc needs an update?
>>> https://cassandra.apache.org/doc/latest/development/release_process.html
>>>
>>>
>>>


Re: [VOTE] Formalizing our CI process

2022-01-12 Thread Yifan Cai
>
> "All releases by default are expected to have a green test run on
> ci-cassandra Jenkins. In exceptional circumstances (security incidents,
> data loss, etc requiring hotfix), members with binding votes on a release
> may choose to approve a release with known failing tests."


+1 with the amendment.

On Wed, Jan 12, 2022 at 10:01 AM Ekaterina Dimitrova 
wrote:

>  +1 w/ Joey's amendment
>
> On Wed, 12 Jan 2022 at 13:00, Michael Shuler 
> wrote:
>
>> (still) +1 as amended
>>
>> Michael
>>
>> On 1/12/22 11:54, Caleb Rackliffe wrote:
>> > +1 w/ Joey's amendment
>> >
>> > On Wed, Jan 12, 2022 at 11:04 AM Joshua McKenzie wrote:
>> >
>> > I'd say an amendment with a directional poll would be fine. I don't
>> > think this is controversial.
>> >
>> > That's me taking "the spirit of the law" rather than the letter
>> > though. I'm good either way.
>> >
>> > ~Josh
>> >
>> > On Wed, Jan 12, 2022 at 11:51 AM Joseph Lynch <joe.e.ly...@gmail.com> wrote:
>> >
>> > On Wed, Jan 12, 2022 at 11:43 AM Joshua McKenzie <jmcken...@apache.org> wrote:
>> >  >
>> >  > I fully concede your point and concern Joey but I propose we
>> > phrase that differently to emphasize the importance of clean
>> tests.
>> >  >
>> >  > "All releases by default are expected to have a green test
>> > run on ci-cassandra Jenkins. In exceptional circumstances
>> > (security incidents, data loss, etc requiring hotfix), members
>> > with binding votes on a release may choose to approve a release
>> > with known failing tests."
>> >
>> > I like the balance that strikes. Should we re-vote or should I
>> > propose
>> > that text as an amendment after this vote (since a simple
>> majority
>> > will likely be reached)?
>> >
>> > -Joey
>> >
>>
>


Re: UDF future

2022-01-19 Thread Yifan Cai
>
> I think we should deprecate scripted UDFs now and drop them from the next
> major, but possibly provide hooks for people to write their own UDF
> "engines" and break out the current javascript implementation in to its own
> repository (but not ship it with Cassandra).


+1

Just want to clarify, is the scripted UDF the one defined using javascript?




On Wed, Jan 19, 2022 at 9:41 AM Francisco Guerrero 
wrote:

> +1 (nb)
>
> On 2022/01/19 15:10:20 Brandon Williams wrote:
> > We can for completeness, but even with twice as much usage reported as
> the
> > other methods, I don't think it will affect the outcome of the vote.
> >
> > On Wed, Jan 19, 2022, 7:25 AM Paulo Motta 
> wrote:
> >
> > > This proposal looks good to me, +1. I was wondering if we should not
> run
> > > this proposal on the user@ list to check if there's any additional
> > > feedback in addition to the informal Twitter and Linkedin channels?
> > >
> > > Em qua., 19 de jan. de 2022 às 10:18, Sylwester Lachiewicz <
> > > slachiew...@gmail.com> escreveu:
> > >
> > >> +1 (Nb)
> > >>
> > >> śr., 19 sty 2022, 12:31 użytkownik Brandon Williams  >
> > >> napisał:
> > >>
> > >>> +1
> > >>>
> > >>> On Tue, Jan 18, 2022 at 10:30 AM Ekaterina Dimitrova
> > >>>  wrote:
> > >>> >
> > >>> > Hi everyone,
> > >>> >
> > >>> > With the work to add Java 17 support for Cassandra, a new question
> > >>> around the future of UDF was raised. The scripted UDF was using
> Nashorn
> > >>> which is no longer packaged with the JDK. There are options to add
> new
> > >>> dependencies to Graal JS for example but it seems people are not
> sure that
> > >>> it is worth it. Please check the discussion on CASSANDRA-16895.
> > >>> >
> > >>> > The following suggestion was made by Marcus and supported by other
> PMC
> > >>> members - "I think we should deprecate scripted UDFs now and drop
> them from
> > >>> the next major, but possibly provide hooks for people to write their
> own
> > >>> UDF "engines" and break out the current javascript implementation in
> to its
> > >>> own repository (but not ship it with Cassandra)."
> > >>> >
> > >>> > As a result we decided to engage with our users and created a
> Twitter
> > >>> survey. Results below:
> > >>> >
> > >>> > We would love to understand how you use ApacheCassandra UDFs and
> UDAs.
> > >>> >
> > >>> > 32 people responded as follows:
> > >>> >
> > >>> > We do not use them - 75%
> > >>> > We only use Java UDFs - 22%
> > >>> > We only use JS UDFs - 0%
> > >>> > We use Java and JS UDFs - 3%
> > >>> >
> > >>> > We also received feedback on LinkedIn on the topic -
> > >>>
> https://www.linkedin.com/feed/update/urn:li:activity:6886728406742970369?commentUrn=urn%3Ali%3Acomment%3A%28activity%3A6886728406742970369%2C6886793921020608512%29&replyUrn=urn%3Ali%3Acomment%3A%28activity%3A6886728406742970369%2C6887421509485248512%29
> > >>> >
> > >>> >
> > >>> > Thoughts?
> > >>> >
> > >>> > Best regards,
> > >>> > Ekaterina
> > >>>
> > >>
> >
>


Re: [VOTE] Release Apache Cassandra 4.0.2

2022-02-10 Thread Yifan Cai
+1 on the release

On Thu, Feb 10, 2022 at 7:23 AM Ekaterina Dimitrova 
wrote:

> +0nb
> I am not sure I am getting enough information from our CI to vote either
> +1 or -1. I spent two days chasing CI issues, worried that I had broken
> something with the CCM change I introduced over the weekend, as CI started
> hanging in a weird way. (If I had known there would be a release, I would
> not have committed a change to ccm…) In the end I reproduced some of the
> issues causing the CI to hang in Circle CI with the CCM version from before
> my changes. On the bright side, I tested all branches one more time in
> Circle CI the other day and got confirmation of the current state.
>
> Release or not I think anyway we have material to think/work on as a
> community
>
> On Thu, 10 Feb 2022 at 4:07, Tommy Stendahl 
> wrote:
>
>> +1 nb
>>
>> On Mon, 2022-02-07 at 15:14 +0100, Mick Semb Wever wrote:
>>
>> Proposing the test build of Cassandra 4.0.2 for release.
>>
>> sha1: 25012d2fec1984cc9c1a352f214eb912ca4f10f5
>>
>> Git:
>> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/4.0.2-tentative
>>
>> Maven Artifacts:
>> https://repository.apache.org/content/repositories/orgapachecassandra-1255/org/apache/cassandra/cassandra-all/4.0.2/
>>
>> The Source and Build Artifacts, and the Debian and RPM packages and
>> repositories, are available here:
>> https://dist.apache.org/repos/dist/dev/cassandra/4.0.2/
>>
>> The vote will be open for 72 hours (longer if needed). Everyone who has
>> tested the build is invited to vote. Votes by PMC members are considered
>> binding. A vote passes if there are at least three binding +1s and no -1's.
>>
>> [1]: CHANGES.txt:
>> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=refs/tags/4.0.2-tentative
>> [2]: NEWS.txt:
>> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=NEWS.txt;hb=refs/tags/4.0.2-tentative
>>
>>


Re: Using labels on pull requests in GitHub

2022-03-16 Thread Yifan Cai
Thank you Stefan for all the efforts!

Regarding the "merge strategy change", should we start a new thread?

I am +1 on adopting the merge button. It should work in the single branch
commit. Just the cross branch commit could be tricky.

- Yifan


Re: Welcome Aleksandr Sorokoumov as Cassandra committer

2022-03-16 Thread Yifan Cai
Congratulations Aleksandr!

On Wed, Mar 16, 2022 at 7:34 AM Andrés de la Peña 
wrote:

> Congrats, well deserved!
>
> On Wed, 16 Mar 2022 at 14:01, J. D. Jordan 
> wrote:
>
>> Congratulations!
>>
>> On Mar 16, 2022, at 8:43 AM, Ekaterina Dimitrova 
>> wrote:
>>
>> 
>> Great news! Well deserved! Congrats and thank you for all your support!
>>
>> On Wed, 16 Mar 2022 at 9:41, Paulo Motta  wrote:
>>
>>> Congratulations Alex, well deserved! :-)
>>>
>>> Em qua., 16 de mar. de 2022 às 10:15, Benjamin Lerer 
>>> escreveu:
>>>
>>>> The PMC members are pleased to announce that Aleksandr Sorokoumov has
>>>> accepted the invitation to become committer.
>>>>
>>>> Thanks a lot, Aleksandr, for everything you have done for the project.
>>>>
>>>> Congratulations and welcome
>>>>
>>>> The Apache Cassandra PMC members

>>>


Re: Updating our Code Contribution/Style Guide

2022-03-16 Thread Yifan Cai
+1 to the guideline.


> > For the instance() / getInstance() methods - I know it is an additional
> effort, but on the other hand it has many advantages because you can
> replace the singleton for testing
>
> Again, do this as necessary. I think for public instances this is a fine
> recommendation, but for private uses it should not be prescribed, only used
> if there is an explicit benefit.


This is about testability. Where mocking is desired, there should be getter
methods instead of `public final` fields. Otherwise, `public final` is
preferred for its simplicity.
Singletons are trickier, though. I see no good use for a private singleton,
which is ugly and makes the referencing code difficult to test. So for
singletons we probably want to declare the `instance()` method.
It is good that the guideline is not rigid.
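
To illustrate the trade-off, here is a minimal sketch that contrasts the two styles. All names (`Clock`, `FieldStyle`, `AccessorStyle`, `setForTesting`) are hypothetical and chosen for the example; they are not taken from the Cassandra code base or from the style guide.

```java
// Sketch: a `public static final` singleton field versus an `instance()` accessor
// whose backing object can be replaced in tests.
final class SingletonAccessSketch
{
    // Hypothetical dependency that tests may want to control.
    interface Clock
    {
        long nowMillis();
    }

    // Style A: simplest form; every caller is bound to this one object.
    static final class FieldStyle
    {
        public static final Clock instance = System::currentTimeMillis;
    }

    // Style B: callers go through instance(), so a test can substitute a fake.
    static final class AccessorStyle
    {
        private static volatile Clock current = System::currentTimeMillis;

        static Clock instance()
        {
            return current;
        }

        // Hypothetical test-only hook.
        static void setForTesting(Clock replacement)
        {
            current = replacement;
        }
    }

    // Production-style code that depends on the singleton through the accessor.
    static boolean isExpired(long deadlineMillis)
    {
        return AccessorStyle.instance().nowMillis() >= deadlineMillis;
    }

    public static void main(String[] args)
    {
        AccessorStyle.setForTesting(() -> 1_000L); // freeze time at t = 1000 ms
        System.out.println(isExpired(500L));       // true
        System.out.println(isExpired(2_000L));     // false
    }
}
```

With the field style, `isExpired` could only be tested against the real system clock. The accessor costs one extra method but makes the dependency replaceable.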


> I don’t think it is a good idea to prohibit or discourage the use of final,
> which is a tool to guard immutability.

Ruslan,
What is proposed is to prohibit or discourage the use of `final` *within a
method body*. I think it is less useful to mark a variable's *reference* as
being immutable within such scope. In the other scenario, e.g. class member
fields, `final` should be used when reference/primitive immutability is
desired.
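
As a small illustration of the distinction, the sketch below uses `final` on a field and omits it on short-lived locals. It is an example written for this point, with made-up names, and is not code from the proposed guide. It also shows that `final` fixes only the reference: the list it points to can still be modified.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: `final` on a field versus plain local variables inside a method body.
final class FinalScopeSketch
{
    // Field: `final` documents and enforces that the reference never changes
    // for the lifetime of the object.
    private final List<String> names = new ArrayList<>();

    int addAll(String... toAdd)
    {
        // Locals: the scope is a few lines, so `final` adds little here.
        int added = 0;
        for (String name : toAdd)
        {
            names.add(name); // allowed: `final` guards the reference, not the contents
            added++;
        }
        return added;
    }

    int size()
    {
        return names.size();
    }

    public static void main(String[] args)
    {
        FinalScopeSketch sketch = new FinalScopeSketch();
        System.out.println(sketch.addAll("a", "b")); // 2
        System.out.println(sketch.size());           // 2
    }
}
```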

- Yifan

On Tue, Mar 15, 2022 at 3:46 AM Ruslan Fomkin 
wrote:

> Hi,
>
> I hope it’s OK I jump to the discussion.
>
> I find it is important to automate code formatting and have a build check
> to verify it; otherwise, as many examples in other projects show, the
> formatting is not followed. To make formatting painless for contributors,
> it would be good to set up git commit hooks (which will require a command
> line formatting tool) in addition to IDE support. In that case the main
> task of the formatting CI build check will be to catch contributions from
> environments that are not yet set up.
> For example, cassandra-dtest already has a CI formatting check in place
> for Python code, which runs on each PR. There is a Python formatting
> command line tool, which can be easily run locally, and if I am not
> mistaken it is easy to set up a git commit hook with it (it also works for
> setting up the formatting in VS Code).
>
> I don’t think it is a good idea to prohibit or discourage the use of final,
> which is a tool to guard immutability. As mentioned, unfortunately Java is
> not designed to be safe by default and thus makes code noisier by
> requiring the keyword.
>
> I noticed an issue with current formatting that there is no indentation if
> an assignment statement is split to multiple lines before or without using
> parenthesis. For example:
> ImmutableMap.Builder InetAddressAndPort>> dcRackBuilder =
> ImmutableMap.builder();
> It would be nice if the next line were indented to make clear that it is
> part of the previous line.
>
> I support Jacek’s request to have each argument on a separate line when
> they are many and need to be placed on multiple lines. For me it takes less
> effort to grasp arguments on separate lines than when several arguments are
> combined on the same line. IMHO the root cause is having too many
> arguments, which is common issue for non-OOP languages.
>
> Best regards,
> Ruslan Fomkin
>
> On 15 Mar 2022, at 10:04, Stefan Miklosovic <
> stefan.mikloso...@instaclustr.com> wrote:
>
> I agree with the single commit approach to fix it all. TBH Javadocs
> are a little bit messy as well, warnings on generating them,
> incomplete, in a lot of cases obsolete or they do not reflect the code
> anymore etc.
>
> On Tue, 15 Mar 2022 at 09:44, bened...@apache.org 
> wrote:
>
>
> I’d be fine with that, though I think if we want to start enforcing
> imports we probably want to mass correct them first. It’s not like other
> style requirements in that there should not be unintended consequences. A
> single (huge) commit to standardise the orders and introduce a build-time
> check would be fine IMO.
>
>
>
> I also don’t really think it is that important.
>
>
>
> From: Jacek Lewandowski 
> Date: Tuesday, 15 March 2022 at 05:18
> To: dev@cassandra.apache.org 
> Subject: Re: Updating our Code Contribution/Style Guide
>
> I do think that we should at least enforce the import order. What is now
> is a complete mess and causes a lot of conflicts during rebasing / merging.
> Perhaps we could start enforcing such rules only on modified files, this
> way we could gradually go towards consistency... wdyt?
>
>
> - - -- --- -  -
> Jacek Lewandowski
>
>
>
>
>
> On Tue, Mar 15, 2022 at 1:52 AM Dinesh Joshi  wrote:
>
> Benedict, I agree. We should not be rigid about applying any style.
> Style checks are meant to bring uniformity to the codebase. I assure you
> what I am proposing is neither rigid nor curbs the ability to apply the
> rules flexibly.
>
>
>
> On Mar 14, 2022, at 4:52 PM, bened...@apache.org wrote:
>
>
>
> I’m a strong -1 on strictly enforcing any style guide. It is there to help
> shape contributions, review feedback and responding 

Re: Welcome Jacek Lewandowski as Cassandra committer

2022-07-06 Thread Yifan Cai
Congrats, Jacek!

From: C. Scott Andreas 
Sent: Wednesday, July 6, 2022 8:26:26 AM
To: dev@cassandra.apache.org 
Cc: dev@cassandra.apache.org 
Subject: Re: Welcome Jacek Lewandowski as Cassandra committer

Congratulations, Jacek!

On Jul 6, 2022, at 7:38 AM, Mick Semb Wever  wrote:


Congrats Jacek!

On Wed, 6 Jul 2022 at 15:00, Ekaterina Dimitrova 
mailto:e.dimitr...@gmail.com>> wrote:
Well deserved, congrats! 🎉

On Wed, 6 Jul 2022 at 8:56, Brandon Williams 
mailto:dri...@gmail.com>> wrote:
Congrats!

On Wed, Jul 6, 2022, 7:00 AM Benjamin Lerer 
mailto:ble...@apache.org>> wrote:
The PMC members are pleased to announce that  Jacek Lewandowski has accepted
the invitation to become committer.

Thanks a lot, Jacek,  for everything you have done!

Congratulations and welcome

The Apache Cassandra PMC members


Re: [DISCUSS] Modeling JIRA fix version for subprojects

2024-10-18 Thread Yifan Cai
That is good to know.

My preference would be moving the driver projects and analytics first, and
renaming the sidecar project later. However, if it is not a concern, I am
fine with doing the change in bulk, especially if it is more convenient.

- Yifan

On Fri, Oct 18, 2024 at 11:24 AM Jon Haddad  wrote:

> I'm 95% sure it redirects automatically.  I believe those redirects also
> work when moving issues from one project to another, so when we move all
> the driver issues to their own repos everything should keep working.
>
>
>
> On Fri, Oct 18, 2024 at 11:21 AM Francisco Guerrero 
> wrote:
>
>> This is one thing that comes to mind.
>>
>> All of the commits in Sidecar have a reference to the JIRA, if we switch
>> it from CASSANDRASC -> CASS-SIDECAR, will there be redirection of
>> these tickets? Or will we lose those links and the ability to
>> automatically
>> refer to the JIRAs?
>>
>> On 2024/10/18 18:17:16 Yifan Cai wrote:
>> > Anyone know if there are any traps when renaming a JIRA project? Since
>> we
>> > are talking about changing CASSANDRASC to CASS-SIDECAR.
>> >
>> >
>> > - Yifan
>> >
>> > On Fri, Oct 18, 2024 at 10:50 AM Patrick McFadin 
>> wrote:
>> >
>> > > Awesome. Now this is the bikeshedding I'm here for. :popcorn:
>> > >
>> > > On Fri, Oct 18, 2024 at 5:26 AM Brandon Williams 
>> wrote:
>> > >
>> > >> I think for the sidecar we need the dash to avoid the typo-likely
>> > >> 'SSS' wart. I have updated INFRA-26212 to follow CASS-DRIVER-.
>> > >>
>> > >> Kind Regards,
>> > >> Brandon
>> > >>
>> > >> On Fri, Oct 18, 2024 at 4:55 AM Mick Semb Wever 
>> wrote:
>> > >> >
>> > >> > Could we also rename CASSANDRASC to CASSSIDECAR ?
>> > >> >
>> > >> > And tbh, i'd much rather CASS-DRIVER-, CASS-ANALYTICS and
>> > >> CASS-SIDECAR
>> > >> >  (but ofc it's not really important)
>> > >> >
>> > >> >
>> > >> >
>> > >> > On Thu, 17 Oct 2024 at 23:11, Štefan Miklošovič <
>> smikloso...@apache.org>
>> > >> wrote:
>> > >> >>
>> > >> >> Interesting, adding "-ana" suffix to a dc which is meant to be an
>> > >> analytical one was pretty common in cases I saw. People just want to
>> look
>> > >> at it and see the difference, how do they name it then? dc2? Also,
>> omitting
>> > >> 2 letters does not seem like a typo to me either.
>> > >> >>
>> > >> >>
>> > >> >> Anyway, we are clearly after the other so CASSANALYTICS be it.
>> > >> >>
>> > >> >>
>> > >> >> On Thu, Oct 17, 2024 at 3:03 PM Jon Haddad <
>> j...@rustyrazorblade.com>
>> > >> wrote:
>> > >> >>>
>> > >> >>> +1 to CASSANALYTICS
>> > >> >>>
>> > >> >>> a fierce -1 to CASSANA. I’ve never once in my life seen this
>> > >> convention and I'd prefer clarity over saving a handful of characters.
>> > >> >>>
>> > >> >>> —
>> > >> >>> Jon Haddad
>> > >> >>> Rustyrazorblade Consulting
>> > >> >>> rustyrazorblade.com
>> > >> >>>
>> > >> >>>
>> > >> >>> On Thu, Oct 17, 2024 at 1:57 PM Štefan Miklošovič <
>> > >> smikloso...@apache.org> wrote:
>> > >> >>>>
>> > >> >>>> CASSANA would do it. CASSANALYTICS is just too long. People are
>> used
>> > >> to the terminology of "ana" e.g. when naming their analytics data
>> centers.
>> > >> >>>>
>> > >> >>>> On Thu, Oct 17, 2024 at 2:55 PM Bernardo Botella <
>> > >> conta...@bernardobotella.com> wrote:
>> > >> >>>>>
>> > >> >>>>> +1 to CASSANALYTICS
>> > >> >>>>>
>> > >> >>>>> On Oct 17, 2024, at 1:48 PM, Yifan Cai 
>> wrote:
>> > >> >>>>>
>> > >> >>>>> yep. CASSANALYTICS sounds good to me. +1
>> > >> >>>>>

Re: [DISCUSS] Modeling JIRA fix version for subprojects

2024-10-18 Thread Yifan Cai
Anyone know if there are any traps when renaming a JIRA project? Since we
are talking about changing CASSANDRASC to CASS-SIDECAR.
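
One possible trap worth checking is the project key format itself. A minimal sketch, assuming Jira's default project key pattern is two or more uppercase letters (as I recall it from Atlassian's "Changing the project key format" documentation; an instance can be configured differently). The constant and function names below are hypothetical and only illustrate the check:

```python
import re

# Assumption: Jira's default project key pattern is ([A-Z][A-Z]+),
# i.e. two or more uppercase ASCII letters and nothing else.
DEFAULT_KEY_PATTERN = re.compile(r"[A-Z][A-Z]+")


def is_valid_default_key(key):
    """Return True if `key` matches the assumed default key pattern."""
    return DEFAULT_KEY_PATTERN.fullmatch(key) is not None


if __name__ == "__main__":
    for candidate in ("CASSANDRASC", "CASS-SIDECAR", "CASSSIDECAR"):
        print(candidate, is_valid_default_key(candidate))
```

Under that assumption a key containing a dash or an underscore would be rejected unless the pattern is changed for the whole instance.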


- Yifan

On Fri, Oct 18, 2024 at 10:50 AM Patrick McFadin  wrote:

> Awesome. Now this is the bikeshedding I'm here for. :popcorn:
>
> On Fri, Oct 18, 2024 at 5:26 AM Brandon Williams  wrote:
>
>> I think for the sidecar we need the dash to avoid the typo-likely
>> 'SSS' wart. I have updated INFRA-26212 to follow CASS-DRIVER-.
>>
>> Kind Regards,
>> Brandon
>>
>> On Fri, Oct 18, 2024 at 4:55 AM Mick Semb Wever  wrote:
>> >
>> > Could we also rename CASSANDRASC to CASSSIDECAR ?
>> >
>> > And tbh, i'd much rather CASS-DRIVER-, CASS-ANALYTICS and
>> CASS-SIDECAR
>> >  (but ofc it's not really important)
>> >
>> >
>> >
>> > On Thu, 17 Oct 2024 at 23:11, Štefan Miklošovič 
>> wrote:
>> >>
>> >> Interesting, adding "-ana" suffix to a dc which is meant to be an
>> analytical one was pretty common in cases I saw. People just want to look
>> at it and see the difference, how do they name it then? dc2? Also, omitting
>> 2 letters does not seem like a typo to me either.
>> >>
>> >>
>> >> Anyway, we are clearly after the other so CASSANALYTICS be it.
>> >>
>> >>
>> >> On Thu, Oct 17, 2024 at 3:03 PM Jon Haddad 
>> wrote:
>> >>>
>> >>> +1 to CASSANALYTICS
>> >>>
>> >>> a fierce -1 to CASSANA. I’ve never once in my life seen this
>> convention and I'd prefer clarity over saving a handful of characters.
>> >>>
>> >>> —
>> >>> Jon Haddad
>> >>> Rustyrazorblade Consulting
>> >>> rustyrazorblade.com
>> >>>
>> >>>
>> >>> On Thu, Oct 17, 2024 at 1:57 PM Štefan Miklošovič <
>> smikloso...@apache.org> wrote:
>> >>>>
>> >>>> CASSANA would do it. CASSANALYTICS is just too long. People are used
>> to the terminology of "ana" e.g. when naming their analytics data centers.
>> >>>>
>> >>>> On Thu, Oct 17, 2024 at 2:55 PM Bernardo Botella <
>> conta...@bernardobotella.com> wrote:
>> >>>>>
>> >>>>> +1 to CASSANALYTICS
>> >>>>>
>> >>>>> On Oct 17, 2024, at 1:48 PM, Yifan Cai  wrote:
>> >>>>>
>> >>>>> yep. CASSANALYTICS sounds good to me. +1
>> >>>>>
>> >>>>> On Thu, Oct 17, 2024 at 1:45 PM Francisco Guerrero <
>> fran...@apache.org> wrote:
>> >>>>>>
>> >>>>>> > Can we include Cassandra Analytics to the infra ticket? I am
>> looking
>> >>>>>> > forward to jira project name suggestions for it...
>> >>>>>>
>> >>>>>> How about CASSANALYTICS ?
>> >>>>>>
>> >>>>>> On 2024/10/17 18:50:45 Yifan Cai wrote:
>> >>>>>> > Can we include Cassandra Analytics to the infra ticket? I am
>> looking
>> >>>>>> > forward to jira project name suggestions for it...
>> >>>>>> >
>> >>>>>> > - Yifan
>> >>>>>> >
>> >>>>>> > On Thu, Oct 17, 2024 at 10:46 AM Patrick McFadin <
>> pmcfa...@gmail.com> wrote:
>> >>>>>> >
>> >>>>>> > > I think it needs a bit more blue. Maybe some pink stripes.
>> I'll file a
>> >>>>>> > > Jira.
>> >>>>>> > >
>> >>>>>> > > On Thu, Oct 17, 2024 at 9:01 AM Brandon Williams <
>> dri...@gmail.com> wrote:
>> >>>>>> > >
>> >>>>>> > >> Thanks everyone, I've created
>> >>>>>> > >> https://issues.apache.org/jira/browse/INFRA-26212
>> >>>>>> > >>
>> >>>>>> > >> Kind Regards,
>> >>>>>> > >> Brandon
>> >>>>>> > >>
>> >>>>>> > >> On Thu, Oct 17, 2024 at 9:55 AM Ekaterina Dimitrova
>> >>>>>> > >>  wrote:
>> >>>>>> > >> >
>> >>>>>> > >> > It w

Re: [DISCUSS] Introduce CREATE TABLE LIKE grammer

2024-10-16 Thread Yifan Cai
"WITH ALL" seems to be a natural addition to the directives. What do you
think about adding the fifth keyword ALL to retain all fields of the table
schema?

For instance, CREATE TABLE new_table LIKE original_table WITH ALL would
replicate options, indexes, triggers, constraints and any applicable kinds
that are introduced in the future.
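
A minimal sketch of how ALL could expand, assuming the four kinds listed above map to keywords named OPTIONS, INDEXES, TRIGGERS and CONSTRAINTS (the spellings and the function name are hypothetical, not taken from the CEP or the patch):

```python
# Hypothetical keyword spellings for the four kinds mentioned above.
KNOWN_KINDS = ("OPTIONS", "INDEXES", "TRIGGERS", "CONSTRAINTS")


def expand_directives(requested):
    """Return the set of schema kinds to copy for a WITH clause.

    ALL is resolved against KNOWN_KINDS at call time, so a kind added
    to KNOWN_KINDS later is covered without changing the statement.
    """
    requested = [keyword.upper() for keyword in requested]
    unknown = [k for k in requested if k != "ALL" and k not in KNOWN_KINDS]
    if unknown:
        raise ValueError("unknown directive(s): " + ", ".join(unknown))
    if "ALL" in requested:
        return set(KNOWN_KINDS)
    return set(requested)
```

An empty WITH clause yields an empty set here, which would correspond to copying only whatever the default behavior copies.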

- Yifan

On Wed, Oct 16, 2024 at 7:46 AM guo Maxwell  wrote:

> Discussed with Bernardo on Slack, and +1 to his advice on adding a
> fourth keyword.
>
> The keyword would be CONSTRAINTS. Any more suggestions?
>
> On Wed, Oct 16, 2024 at 9:55 AM guo Maxwell wrote:
>
>> Hi Yifan,
>> Thanks for bringing this up. The SELECT permission on the original table
>> is needed. MySQL and PostgreSQL both mention this, and I also specifically
>> noticed this in my code.
>>
>> I probably missed this in the CEP documentation. 😅
>>
>> On Wed, Oct 16, 2024 at 7:46 AM Yifan Cai wrote:
>>
>>> Thanks for creating the CEP! I think it is missing Bernardo's comment on
>>> "the need for read permissions on the source table".
>>>
>>> CreateTableStatement does not check the permissions outside of the
>>> enclosing keyspace. Having the SELECT permission on the original table is a
>>> requirement for CREATE TABLE LIKE.
>>>
>>> - Yifan
>>>
>>> On Sun, Sep 29, 2024 at 11:01 PM guo Maxwell 
>>> wrote:
>>>
>>>> Hello, everyone ,
>>>> I have finished the doc for CEP-43 for CREATE_TABLE_LIKE
>>>> <https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-43++Apache+Cassandra+CREATE+TABLE++LIKE>
>>>>  as
>>>> said before, looking forward to your suggestions.
>>>>
>>>> On Wed, Sep 25, 2024 at 3:51 AM Štefan Miklošovič wrote:
>>>>
>>>>> I am sorry I do not follow what you mean, maybe an example would help.
>>>>>
>>>>> On Tue, Sep 24, 2024 at 6:18 PM guo Maxwell 
>>>>> wrote:
>>>>>
>>>>>>
>>>>>> If there are multiple schema information changes in one ddl
>>>>>> statement, will there be schema conflicts in extreme cases?
>>>>>> For example, our statement contains both table creation and index
>>>>>> creation.
>>>>>>
>>>>>> On Tue, Sep 24, 2024 at 8:12 PM guo Maxwell wrote:
>>>>>>
>>>>>>> +1 on splitting this task  and adding the ability to copy tables
>>>>>>> through different keyspaces in the future.
>>>>>>>
>>>>>>> On Mon, Sep 23, 2024 at 10:05 PM Štefan Miklošovič wrote:
>>>>>>>
>>>>>>>> If we have this table
>>>>>>>>
>>>>>>>> CREATE TABLE ks.tb2 (
>>>>>>>> id int PRIMARY KEY,
>>>>>>>> name text
>>>>>>>> );
>>>>>>>>
>>>>>>>> I can either specify name of an index on my own like this:
>>>>>>>>
>>>>>>>> CREATE INDEX name_index ON ks.tb2 (name) ;
>>>>>>>>
>>>>>>>> or I can let Cassandra figure out that name on its own:
>>>>>>>>
>>>>>>>> CREATE INDEX ON ks.tb2 (name) ;
>>>>>>>>
>>>>>>>> in that case it will name that index "tb2_name_idx".
>>>>>>>>
>>>>>>>> Hence, I would expect that when we do
>>>>>>>>
>>>>>>>> ALTER TABLE ks.to_copy LIKE ks.tb2 WITH INDICES;
>>>>>>>>
>>>>>>>> Then ks.to_copy table will have an index which is called
>>>>>>>> "to_copy_name_idx" without me doing anything.
>>>>>>>>
>>>>>>>> For types, we do not need to do anything when we deal with the same
>>>>>>>> keyspace. For simplicity, I mentioned that we might deal with the same
>>>>>>>> keyspace scenario only for now and iterate on that in the future.
>>>>>>>>
>>>>>>>> On Mon, Sep 23, 2024 at 8:53 AM guo Maxwell 
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hello everyone,
>>>>>>>>>
>>>>>>>>> Cep is being written, and I encountered some problems during the
>>>>>>>>> process. I would like to discuss them with you. If you read the 
>>

Re: [VOTE] CEP-44: Kafka integration for Cassandra CDC using Sidecar

2024-10-17 Thread Yifan Cai
+1 nb

From: Brandon Williams 
Sent: Thursday, October 17, 2024 11:47:13 AM
To: dev@cassandra.apache.org 
Subject: Re: [VOTE] CEP-44: Kafka integration for Cassandra CDC using Sidecar

+1

Kind Regards,
Brandon

On Thu, Oct 17, 2024 at 1:08 PM James Berragan  wrote:
>
> Hi everyone,
>
> I would like to start the voting for CEP-44 as all the feedback in the 
> discussion thread seems to be addressed.
>
> Proposal: CEP-44: Kafka integration for Cassandra CDC using Sidecar
> Discussion thread: 
> https://lists.apache.org/thread/8k6njsnvdbmjb6jhyy07o1s7jz8xp1qg
>
> As per the CEP process documentation, this vote will be open for 72 hours 
> (longer if needed).
>
> Thanks!
> James.


Re: [DISCUSS] Introduce CREATE TABLE LIKE grammer

2024-10-17 Thread Yifan Cai
RESSION CONSTRAINTS
>>>GENERATED IDENTITY STATISTICS STORAGE
>>>
>>> Conclusion: If there may be more keywords to consider in the future,
>>> such as more than 4 , I am +1 on adding ALL back .
>>>
>>> To Dave :
>>>Default behavior is only copy column name, data type ,data mask ,
>>> you can see more detail from  CEP-43
>>> <https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-43++Apache+Cassandra+CREATE+TABLE++LIKE>
>>> .
>>>
>>>
>>> On Thu, Oct 17, 2024 at 6:43 AM Patrick McFadin wrote:
>>>
>>>> +1 That makes much more sense in my experience.
>>>>
>>>> On Wed, Oct 16, 2024 at 12:12 PM Dave Herrington 
>>>> wrote:
>>>>
>>>>> I'm coming at this with both a deep ANSI SQL background as well as CQL
>>>>> background.
>>>>>
>>>>> Defining the default behavior is the starting point.  What gets copied
>>>>> if we do "CREATE TABLE new_table LIKE original_table;" without a WITH
>>>>> clause?
>>>>>
>>>>> Then, you build on that with the specific WITH options.  WITH ALL
>>>>> catches everything.
>>>>>
>>>>> -Dave
>>>>>
>>>>> On Wed, Oct 16, 2024 at 11:16 AM Yifan Cai  wrote:
>>>>>
>>>>>> "WITH ALL" seems to be a natural addition to the directives. What do
>>>>>> you think about adding the fifth keyword ALL to retain all fields of the
>>>>>> table schema?
>>>>>>
>>>>>> For instance, CREATE TABLE new_table LIKE original_table WITH ALL, it
>>>>>> replicates options, indexes, triggers, constraints and any applicable 
>>>>>> kinds
>>>>>> that are introduced in the future.
>>>>>>
>>>>>> - Yifan
>>>>>>
>>>>>> On Wed, Oct 16, 2024 at 7:46 AM guo Maxwell 
>>>>>> wrote:
>>>>>>
>>>>>>> Discussed with Bernardo on Slack, and +1 to his advice on adding a
>>>>>>> fourth keyword.
>>>>>>>
>>>>>>> The keyword would be CONSTRAINTS. Any more suggestions?
>>>>>>>
>>>>>>> On Wed, Oct 16, 2024 at 9:55 AM guo Maxwell wrote:
>>>>>>>
>>>>>>>> Hi Yifan,
>>>>>>>> Thanks for bringing this up. The SELECT permission on the original
>>>>>>>> table is needed. MySQL and PostgreSQL both mention this, and I also
>>>>>>>> specifically noticed this in my code.
>>>>>>>>
>>>>>>>> I probably missed this in the CEP documentation. 😅
>>>>>>>>
>>>>>>>> On Wed, Oct 16, 2024 at 7:46 AM Yifan Cai wrote:
>>>>>>>>
>>>>>>>>> Thanks for creating the CEP! I think it is missing Bernardo's
>>>>>>>>> comment on "the need for read permissions on the source table".
>>>>>>>>>
>>>>>>>>> CreateTableStatement does not check the permissions outside of the
>>>>>>>>> enclosing keyspace. Having the SELECT permission on the original 
>>>>>>>>> table is a
>>>>>>>>> requirement for CREATE TABLE LIKE.
>>>>>>>>>
>>>>>>>>> - Yifan
>>>>>>>>>
>>>>>>>>> On Sun, Sep 29, 2024 at 11:01 PM guo Maxwell 
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hello, everyone ,
>>>>>>>>>> I have finished the doc for CEP-43 for CREATE_TABLE_LIKE
>>>>>>>>>> <https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-43++Apache+Cassandra+CREATE+TABLE++LIKE>
>>>>>>>>>>  as
>>>>>>>>>> said before, looking forward to your suggestions.
>>>>>>>>>>
>>>>>>>>>> On Wed, Sep 25, 2024 at 3:51 AM Štefan Miklošovič wrote:
>>>>>>>>>>
>>>>>>>>>>> I am sorry I do not follow what you mean, maybe an example would
>>>>>>>>>>> help.
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Sep 24, 2024 at 6:18 PM guo Maxwell wrote:

Re: [DISCUSS] Modeling JIRA fix version for subprojects

2024-10-17 Thread Yifan Cai
Can we include Cassandra Analytics to the infra ticket? I am looking
forward to jira project name suggestions for it...

- Yifan

On Thu, Oct 17, 2024 at 10:46 AM Patrick McFadin  wrote:

> I think it needs a bit more blue. Maybe some pink stripes. I'll file a
> Jira.
>
> On Thu, Oct 17, 2024 at 9:01 AM Brandon Williams  wrote:
>
>> Thanks everyone, I've created
>> https://issues.apache.org/jira/browse/INFRA-26212
>>
>> Kind Regards,
>> Brandon
>>
>> On Thu, Oct 17, 2024 at 9:55 AM Ekaterina Dimitrova
>>  wrote:
>> >
>> > It would have been nice to be in red italic but… :-)
>> >
>> > Thanks, Brandon, +1 to the suggestion on my end too. Sounds reasonable
>> to me
>> >
>> >
>> > On Thu, 17 Oct 2024 at 17:50, Abe Ratnofsky  wrote:
>> >>
>> >> +1 to CASSDRIVER-JAVA et al.
>> >>
>> >> On Oct 17, 2024, at 10:37 AM, Jon Haddad 
>> wrote:
>> >>
>> >> Sgtm, let’s ship it
>> >>
>> >> +1
>> >>
>> >>
>> >>
>> >> On Thu, Oct 17, 2024 at 4:09 AM Brandon Williams 
>> wrote:
>> >>>
>> >>> Nobody wants to suggest a color for this bikeshed?  I'll start:
>> >>> CASSDRIVER-. I'd like to get on this sooner than later since
>> >>> during the time we wait the situation worsens.
>> >>>
>> >>> Kind Regards,
>> >>> Brandon
>> >>>
>> >>> On Wed, Oct 2, 2024 at 5:07 PM Brandon Williams 
>> wrote:
>> >>> >
>> >>> > I think we just need to ask infra to create the jira instances, but
>> I
>> >>> > guess we need to have some kind of consistent naming scheme to help
>> >>> > identify them?
>> >>> >
>> >>> > Kind Regards,
>> >>> > Brandon
>> >>> >
>> >>> > On Wed, Oct 2, 2024 at 1:02 PM Francisco Guerrero <
>> fran...@apache.org> wrote:
>> >>> > >
>> >>> > > +1 too on the points brought by Mick, we need more visibility into
>> >>> > > subprojects. For starters, we should look into integrating Qbot
>> >>> > > notifications in #cassandra-dev and #cassandra-noise for
>> >>> > > CASSANDRASC tickets. Let me know if I can help with that.
>> >>> > >
>> >>> > > On 2024/10/02 17:39:28 Yifan Cai wrote:
>> >>> > > > +1 on all the points raised by Mick. Please let me know if
>> there is
>> >>> > > > anything I can help with.
>> >>> > > >
>> >>> > > > - Yifan
>> >>> > > >
>> >>> > > > On Wed, Oct 2, 2024 at 8:13 AM Josh McKenzie <
>> jmcken...@apache.org> wrote:
>> >>> > > >
>> >>> > > > > - Qbot notifications in #cassandra-dev and #cassandra-noise ,
>> as well as
>> >>> > > > > in any subproject channels
>> >>> > > > > - some cadence of dev@ ML updates, e.g. on activities, or
>> dependency
>> >>> > > > > changes, etc
>> >>> > > > > - regular releases
>> >>> > > > >
>> >>> > > > > Agree on all 3 points. Also - I've *definitely* fallen off on
>> the project
>> >>> > > > > updates for mainline; I'll pick that back up after ApacheCon.
>> >>> > > > >
>> >>> > > > >
>> >>> > > > > On Wed, Oct 2, 2024, at 1:57 AM, Mick Semb Wever wrote:
>> >>> > > > >
>> >>> > > > > To play devil's advocate here, it's important that the
>> subprojects don't
>> >>> > > > > lose visibility and silo from the rest of the project.
>> >>> > > > >
>> >>> > > > > There are different ways to solve this, and lumping
>> everything into one
>> >>> > > > > jira project is a messy and poor way of doing it.  But as the
>> sidecar has
>> >>> > > > > shown us, subproject activity should somehow be made noisy to
>> us.  We need
>> >>> > > > > sorts of common spaces in the project.
>> >>> > > > >
>> >>> > > > > If we go the sep

Re: [DISCUSS] Modeling JIRA fix version for subprojects

2024-10-17 Thread Yifan Cai
yep. CASSANALYTICS sounds good to me. +1

On Thu, Oct 17, 2024 at 1:45 PM Francisco Guerrero 
wrote:

> > Can we include Cassandra Analytics to the infra ticket? I am looking
> > forward to jira project name suggestions for it...
>
> How about CASSANALYTICS ?
>
> On 2024/10/17 18:50:45 Yifan Cai wrote:
> > Can we include Cassandra Analytics to the infra ticket? I am looking
> > forward to jira project name suggestions for it...
> >
> > - Yifan
> >
> > On Thu, Oct 17, 2024 at 10:46 AM Patrick McFadin 
> wrote:
> >
> > > I think it needs a bit more blue. Maybe some pink stripes. I'll file a
> > > Jira.
> > >
> > > On Thu, Oct 17, 2024 at 9:01 AM Brandon Williams 
> wrote:
> > >
> > >> Thanks everyone, I've created
> > >> https://issues.apache.org/jira/browse/INFRA-26212
> > >>
> > >> Kind Regards,
> > >> Brandon
> > >>
> > >> On Thu, Oct 17, 2024 at 9:55 AM Ekaterina Dimitrova
> > >>  wrote:
> > >> >
> > >> > It would have been nice to be in red italic but… :-)
> > >> >
> > >> > Thanks, Brandon, +1 to the suggestion on my end too. Sounds
> reasonable
> > >> to me
> > >> >
> > >> >
> > >> > On Thu, 17 Oct 2024 at 17:50, Abe Ratnofsky  wrote:
> > >> >>
> > >> >> +1 to CASSDRIVER-JAVA et al.
> > >> >>
> > >> >> On Oct 17, 2024, at 10:37 AM, Jon Haddad 
> > >> wrote:
> > >> >>
> > >> >> Sgtm, let’s ship it
> > >> >>
> > >> >> +1
> > >> >>
> > >> >>
> > >> >>
> > >> >> On Thu, Oct 17, 2024 at 4:09 AM Brandon Williams
> > >> wrote:
> > >> >>>
> > >> >>> Nobody wants to suggest a color for this bikeshed?  I'll start:
> > >> >>> CASSDRIVER-. I'd like to get on this sooner than later
> since
> > >> >>> during the time we wait the situation worsens.
> > >> >>>
> > >> >>> Kind Regards,
> > >> >>> Brandon
> > >> >>>
> > >> >>> On Wed, Oct 2, 2024 at 5:07 PM Brandon Williams
> > >> wrote:
> > >> >>> >
> > >> >>> > I think we just need to ask infra to create the jira instances,
> but
> > >> I
> > >> >>> > guess we need to have some kind of consistent naming scheme to
> help
> > >> >>> > identify them?
> > >> >>> >
> > >> >>> > Kind Regards,
> > >> >>> > Brandon
> > >> >>> >
> > >> >>> > On Wed, Oct 2, 2024 at 1:02 PM Francisco Guerrero <
> > >> fran...@apache.org> wrote:
> > >> >>> > >
> > >> >>> > > +1 too on the points brought by Mick, we need more visibility
> into
> > >> >>> > > subprojects. For starters, we should look into integrating
> Qbot
> > >> >>> > > notifications in #cassandra-dev and #cassandra-noise for
> > >> >>> > > CASSANDRASC tickets. Let me know if I can help with that.
> > >> >>> > >
> > >> >>> > > On 2024/10/02 17:39:28 Yifan Cai wrote:
> > >> >>> > > > +1 on all the points raised by Mick. Please let me know if
> > >> there is
> > >> >>> > > > anything I can help with.
> > >> >>> > > >
> > >> >>> > > > - Yifan
> > >> >>> > > >
> > >> >>> > > > On Wed, Oct 2, 2024 at 8:13 AM Josh McKenzie <
> > >> jmcken...@apache.org> wrote:
> > >> >>> > > >
> > >> >>> > > > > - Qbot notifications in #cassandra-dev and
> #cassandra-noise ,
> > >> as well as
> > >> >>> > > > > in any subproject channels
> > >> >>> > > > > - some cadence of dev@ ML updates, e.g. on activities, or
> > >> dependency
> > >> >>> > > > > changes, etc
> > >> >>> > > > > - regular releases
> > >> >>> > > > >
> > >> >>> >

Re: [DISCUSS] Modeling JIRA fix version for subprojects

2024-10-02 Thread Yifan Cai
+1 on all the points raised by Mick. Please let me know if there is
anything I can help with.

- Yifan

On Wed, Oct 2, 2024 at 8:13 AM Josh McKenzie  wrote:

> - Qbot notifications in #cassandra-dev and #cassandra-noise , as well as
> in any subproject channels
> - some cadence of dev@ ML updates, e.g. on activities, or dependency
> changes, etc
> - regular releases
>
> Agree on all 3 points. Also - I've *definitely* fallen off on the project
> updates for mainline; I'll pick that back up after ApacheCon.
>
>
> On Wed, Oct 2, 2024, at 1:57 AM, Mick Semb Wever wrote:
>
> To play devil's advocate here, it's important that the subprojects don't
> lose visibility and silo from the rest of the project.
>
> There are different ways to solve this, and lumping everything into one
> jira project is a messy and poor way of doing it.  But as the sidecar has
> shown us, subproject activity should somehow be made noisy to us.  We need
> sorts of common spaces in the project.
>
> If we go the separate jira project route, then some suggestions to help
> with this are:
> - Qbot notifications in #cassandra-dev and #cassandra-noise , as well as
> in any subproject channels
> - some cadence of dev@ ML updates, e.g. on activities, or dependency
> changes, etc
> - regular releases
>
>
> On Tue, 9 Apr 2024 at 04:11, Dinesh Joshi  wrote:
>
> hi folks - sorry to have dropped the ball on responding to this thread.
>
> My 2 cents are as follows -
>
> 1. Having a separate JIRA project for each sub-project will add management
> overhead. This option, however, allows us to model unique workflows for the
> sub-project.
>
> 2. Managing the sub-project as part of the Cassandra JIRA project would
> imply less management overhead but the sub-project would need to conform to
> the same workflows.
>
> I would pick option 1 unless there is a strong reason and desire to manage
> a separate Jira project. We can always split out the Java Driver project if
> things don't work out. OTOH merging a Jira project is harder.
>
> Thanks,
>
> Dinesh
>
> On Thu, Apr 4, 2024 at 12:45 PM Abe Ratnofsky  wrote:
>
> CEP-8 proposes using separate Jira projects per Cassandra sub-project:
>
> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-8%3A+DataStax+Drivers+Donation
>
> > We suggest distinct Jira projects, one per driver, all to be created.
>
> I don't see any discussion changing that from the [DISCUSS] or vote
> threads:
> https://lists.apache.org/thread/01pljcncyjyo467l5orh8nf9okrh7oxm
> https://lists.apache.org/thread/opt630do09phh7hlt28odztxdv6g58dp
> https://lists.apache.org/thread/crolkrhd4y6tt3k4hsy204xomshlcp4p
>
> But looks like upon acceptance that was changed:
> https://lists.apache.org/thread/dhov01s8dvvh3882oxhkmmfv4tqdd68o
>
> > New issues will be tracked under the CASSANDRA project on Apache’s JIRA <
> https://issues.apache.org/jira/projects/CASSANDRA> under the component
> ‘Client/java-driver’.
>
> I'm in favor of using the same Jira as Cassandra proper. Committership is
> project-wide, so having a standardized process (same ticket flow, review
> rules, labels, etc. is beneficial). But multiple votes happened based on
> the content of the CEP, so we should stick to what was voted on and move to
> a separate Jira.
>
> --
> Abe
>
>
>


Re: [DISCUSS] Modeling JIRA fix version for subprojects

2024-10-24 Thread Yifan Cai
Thanks for working on this!

Another bikeshed I noticed is the project logo.

Currently, all of them share the same one as Sidecar. The subprojects can
be styled up.  :p

- Yifan

On Wed, Oct 23, 2024 at 9:46 AM Brandon Williams  wrote:

> https://issues.apache.org/jira/projects/CASSPYTHON/
> https://issues.apache.org/jira/projects/CASSJAVA/
> https://issues.apache.org/jira/projects/CASSGO/
>
> are now live.  If you know of any issues to move there, please do so.
>
> Kind Regards,
> Brandon
>
> On Wed, Oct 23, 2024 at 6:59 AM Brandon Williams  wrote:
> >
> > If nobody objects I will be creating the CASS jira projects later
> today.
> >
> > Kind Regards,
> > Brandon
> >
> > On Tue, Oct 22, 2024 at 10:35 AM Ekaterina Dimitrova
> >  wrote:
> > >
> > > Honestly, counting the letters was also a thing that happened to me
> but I should admit that even with CASSANDRAANALYTICS we count the As…
> > >
> > > My preference is CASSX
> > >
> > > Seems shorter and less painful to read to me as a user.
> > >
> > > Thanks
> > >
> > > On Tue, 22 Oct 2024 at 11:18, Patrick McFadin 
> wrote:
> > >>
> > >> CASS + NAME is my +1
> > >>
> > >> TBH rarely will this be typed. Just copied and pasted. It has to be
> > >> clear that naming is different from the other projects and I think we
> > >> get it either way.
> > >>
> > >> On Tue, Oct 22, 2024 at 8:15 AM Štefan Miklošovič
> > >>  wrote:
> > >> >
> > >> > Something like this?
> > >> >
> > >> > CASSANDRA
> > >> > CASSPYTHON
> > >> > CASSGO
> > >> > CASSJAVA
> > >> > CASSSIDECAR
> > >> > CASSANALYTICS
> > >> >
> > >> > if we expand it would be like
> > >> >
> > >> > CASSANDRA
> > >> > CASSANDRAPYTHON
> > >> > CASSANDRAGO
> > >> > CASSANDRAJAVA
> > >> > CASSANDRASIDECAR
> > >> > CASSANDRAANALYTICS
> > >> >
> > >> > I don't know ... the first form seems fine to me but that triple S
> in CASSSIDECAR is strange. I just find myself counting S's when I type it.
> > >> >
> > >> > Up to you guys. I don't mind both.
> > >> >
> > >> > On Tue, Oct 22, 2024 at 5:01 PM Brandon Williams 
> wrote:
> > >> >>
> > >> >> I don't think underscore is an option from selfserve anyway.  If we
> > >> >> have to stick everything together then I think having fewer things
> is
> > >> >> better, so we could drop the 'driver' and just name things like
> > >> >> CASSPYTHON.  WDYT?
> > >> >>
> > >> >> Kind Regards,
> > >> >> Brandon
> > >> >>
> > >> >> On Tue, Oct 22, 2024 at 9:33 AM Štefan Miklošovič
> > >> >>  wrote:
> > >> >> >
> > >> >> > So we will have stuff like
> > >> >> >
> > >> >> > CASS_DRIVER_PYTHON and all tickets in CHANGES.txt as well as in
> the commit messages will be like
> > >> >> >
> > >> >> > CASS_DRIVER_PYTHON-1234
> > >> >> >
> > >> >> > I checked (1) and there is not a single one which has
> underscores in its name, now THAT would be a precedent, wouldn't it ...
> > >> >> >
> > >> >> > (1)
> https://issues.apache.org/jira/secure/BrowseProjects.jspa?selectedCategory=all&selectedProjectType=all
> > >> >> >
> > >> >> > On Tue, Oct 22, 2024 at 4:17 PM Martin Sucha <
> martin.su...@kiwi.com> wrote:
> > >> >> >>
> > >> >> >> This seems to be relevant documentation:
> https://confluence.atlassian.com/adminjiraserver/changing-the-project-key-format-938847081.html
> > >> >> >>
> > >> >> >> Martin
> > >> >> >>
> > >> >> >> 
> > >> >> >> This email, including attached files, may contain confidential
> information and is intended only for the use of the individual and/or
> entity to which it is addressed. If you are not the intended recipient,
> disclosure, copying, use, or distribution of the information included in
> this email and/or in its attachments is prohibited.
> > >> >> >> If you have received it by mistake, please do not read, copy or
> use it, or disclose its contents to others. Please notify the sender that
> you have received this email by mistake by replying to the email, and then
> delete the email and any copies and attachments of it. Thank you.
>


Re: [VOTE] CEP-42: Constraints Framework

2024-10-24 Thread Yifan Cai
Hello, everyone.

I’ve been reviewing the patch for the constraints framework
<https://github.com/apache/cassandra/pull/3562>, and I believe there are
several aspects outlined in CEP-42 that warrant reconsideration. I’d like
to bring these points up for discussion.
*1. New Reserved Keyword*

The patch introduces a new reserved keyword, "CONSTRAINT." Since reserved
keywords cannot be used as identifiers unless quoted, this can complicate
data definition declarations. We should aim to avoid adding new reserved
keywords where possible. Here are a couple of alternatives:

1.1 *Inline Constraint Definition*

We could eliminate the keyword "CONSTRAINT." Instead, similar to data
masking, constraints could be defined using "CONSTRAINED WITH." For
example, in the following code, r_value_range_lower_bound and
r_value_range_upper_bound are constraint names, followed immediately by
their expressions, with multiple constraints connected using "AND".

CREATE TABLE rgb (
  name text PRIMARY KEY,
  r int CONSTRAINED WITH r_value_range_lower_bound CHECK r >= 0
    AND r_value_range_upper_bound CHECK r < 256,
  ...
);
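
To illustrate how such named, single-column checks could behave at write time, here is a minimal sketch in plain Python. It is not the patch's implementation; the tuple layout and the function name are assumptions made only for this example:

```python
import operator

# Comparison operators used by the example checks.
OPS = {">=": operator.ge, "<": operator.lt}

# (constraint name, column, operator, bound), mirroring the rgb example.
CONSTRAINTS = [
    ("r_value_range_lower_bound", "r", ">=", 0),
    ("r_value_range_upper_bound", "r", "<", 256),
]


def violated(row, constraints=CONSTRAINTS):
    """Return the names of the constraints that the written values break.

    Columns that are absent from the write are skipped, because a
    single-column check has nothing to evaluate for them.
    """
    failures = []
    for name, column, op, bound in constraints:
        if column in row and not OPS[op](row[column], bound):
            failures.append(name)
    return failures
```

Reporting the constraint name in the failure is what lets a user tell at a glance which bound was hit.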

1.2 *Special Symbol*

Another option is to use a special symbol to differentiate from
identifiers, such as "@CONSTRAINT." However, since there is currently no
annotation-like concept in CQL, this might confuse users.

CREATE TABLE rgb (
  name text PRIMARY KEY,
  r int,
  ...
  @CONSTRAINT r_value_range_lower_bound CHECK r >= 0,
  @CONSTRAINT r_value_range_upper_bound CHECK r < 256,
  ...
);

*2. Constraint Name*

CEP-42 states, "Name of the constraint is optional. If it is not provided,
a name is generated for the constraint."

However, based on the actual statements defining constraints, I believe
names should be *mandatory* for clarity and usability. System-generated
names often lack descriptiveness.

*3. Cross-Column Constraints*

CEP-42 proposes allowing constraints that compare multiple columns. For
example,

CREATE TABLE keyspace.table (
  p1 int,
  p2 int,
  ...,
  CONSTRAINT [name] CHECK (p1 != p2)
);

Such constraints can be problematic due to their referential nature.
Consider scenarios where column p2 is dropped, or when insert/update
operations include only partial values (e.g., only inserting p1). Should
the query result in a read (before write), or should it fail due to
incomplete values?

For simplicity, I propose that, at least for the initial iteration, we
exclude support for cross-column constraints. In other words, constraints
should only check the values of individual columns.
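
To make the partial-write concern concrete, here is a minimal Java sketch. It is not taken from the patch; the class name, the Result enum and both methods are hypothetical, and a write is modeled simply as a map of the columns it sets. The single-column check can always be decided from the write alone, while the cross-column check cannot when only p1 is supplied.

```java
import java.util.Map;

public class ConstraintSketch {
    /** Possible outcomes when a check is evaluated against a single write. */
    public enum Result { SATISFIED, VIOLATED, CANNOT_EVALUATE }

    /** Single-column check (0 <= r < 256): only the written value of r is needed. */
    public static Result checkRange(Map<String, Integer> write) {
        Integer r = write.get("r");
        if (r == null) {
            return Result.SATISFIED; // r is not part of this write, nothing to check
        }
        return (r >= 0 && r < 256) ? Result.SATISFIED : Result.VIOLATED;
    }

    /** Cross-column check (p1 != p2): both values must be known. */
    public static Result checkNotEqual(Map<String, Integer> write) {
        Integer p1 = write.get("p1");
        Integer p2 = write.get("p2");
        if (p1 == null || p2 == null) {
            // Partial write: deciding this would need a read-before-write or a rejection.
            return Result.CANNOT_EVALUATE;
        }
        return p1.equals(p2) ? Result.VIOLATED : Result.SATISFIED;
    }

    public static void main(String[] args) {
        System.out.println(checkRange(Map.of("r", 300)));            // VIOLATED
        System.out.println(checkNotEqual(Map.of("p1", 1, "p2", 1))); // VIOLATED
        System.out.println(checkNotEqual(Map.of("p1", 1)));          // CANNOT_EVALUATE
    }
}
```

The CANNOT_EVALUATE branch is exactly the case the question above is about: the server would have to either read the existing row or fail the write.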

- Yifan

On Thu, Sep 19, 2024 at 11:46 AM Patrick McFadin  wrote:

> Thanks for the update. My inbox search failed me :D
>
> On Thu, Sep 19, 2024 at 11:31 AM Bernardo Botella <
> conta...@bernardobotella.com> wrote:
>
>> Hi Patrick,
>>
>> Thanks for taking a look at this and keeping the house tidy.
>>
>> I announced the voting results on a separate thread:
>> <https://lists.apache.org/thread/v73cwc8p80xx7zpkldjq6w1qrkf2k9h0>
>>
>> As a follow up, this is not stalled, and I’m currently working on a patch
>> that will be soon available for review.
>>
>> Thanks,
>> Bernardo
>>
>>
>> On Sep 19, 2024, at 11:20 AM, Patrick McFadin  wrote:
>>
>> I'm going to cap this thread. Vote passes with no binding -1s.
>>
>> On Tue, Jul 2, 2024 at 2:25 PM Jordan West  wrote:
>>
>>> +1
>>>
>>> On Tue, Jul 2, 2024 at 12:15 Francisco Guerrero 
>>> wrote:
>>>
>>>> +1
>>>>
>>>> On 2024/07/02 18:45:33 Josh McKenzie wrote:
>>>> > +1
>>>> >
>>>> > On Tue, Jul 2, 2024, at 1:18 PM, Abe Ratnofsky wrote:
>>>> > > +1 (nb)
>>>> > >
>>>> > >> On Jul 2, 2024, at 12:15 PM, Yifan Cai  wrote:
>>>> > >>
>>>> > >> +1 on CEP-42.
>>>> > >>
>>>> > >> - Yifan
>>>> > >>
>>>> > >> On Tue, Jul 2, 2024 at 5:17 AM Jon Haddad 
>>>> wrote:
>>>> > >>> +1
>>>> > >>>
>>>> > >>> On Tue, Jul 2, 2024 at 5:06 AM  wrote:
>>>> > >>>> +1
>>>> > >>>>
>>>> > >>>>
>>>> > >>>>> On Jul 1, 2024, at 8:34 PM, Doug Rohrer 
>>>> wrote:
>>>> > >>>>>
>>>> > >>>>> +1 (nb) - Thanks fo

Re: [DISCUSS] Usage of "var" instead of types in the code

2024-10-29 Thread Yifan Cai
I am in favor of *disallowing* the `var` keyword.

It hurts readability, especially in environments without type-inference
support, e.g. a plain text editor or the GitHub site.

It can also introduce performance degradation that goes unnoticed. Consider
the following code, for example:

Set<String> allNames()
{
    return null;
}

boolean contains(String name)
{
    var names = allNames();
    return names.contains(name);
}

Suppose allNames is later refactored to return a List. The contains method
still compiles unchanged, but it now does a linear scan instead of a hash
lookup, so it runs slower.

List<String> allNames()
{
    return null;
}
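
A self-contained sketch of the same effect, for anyone who wants to see it run. Everything here is illustrative and not from the Cassandra codebase: the Name wrapper only exists to count equals() calls, and the two helper methods stand in for allNames() before and after the refactoring. Both lookup methods are textually identical apart from the helper they call, which is the point: with var, nothing at the call site shows the change.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class VarInferenceSketch {
    // Counts how many element comparisons a lookup performs.
    static int equalsCalls = 0;

    static final class Name {
        final String value;
        Name(String value) { this.value = value; }
        @Override public boolean equals(Object o) {
            equalsCalls++;
            return o instanceof Name && ((Name) o).value.equals(value);
        }
        @Override public int hashCode() { return value.hashCode(); }
    }

    // allNames() before the refactoring: returns a Set.
    static Set<Name> allNamesAsSet(int n) {
        Set<Name> names = new HashSet<>();
        for (int i = 0; i < n; i++) names.add(new Name("name-" + i));
        return names;
    }

    // allNames() after the refactoring: returns a List.
    static List<Name> allNamesAsList(int n) {
        List<Name> names = new ArrayList<>();
        for (int i = 0; i < n; i++) names.add(new Name("name-" + i));
        return names;
    }

    public static int comparisonsWithSet(int n, String name) {
        var names = allNamesAsSet(n);   // inferred as Set<Name>
        equalsCalls = 0;
        names.contains(new Name(name)); // hash lookup
        return equalsCalls;
    }

    public static int comparisonsWithList(int n, String name) {
        var names = allNamesAsList(n);  // inferred as List<Name>, same source text
        equalsCalls = 0;
        names.contains(new Name(name)); // linear scan
        return equalsCalls;
    }

    public static void main(String[] args) {
        System.out.println("Set comparisons:  " + comparisonsWithSet(1000, "missing"));
        System.out.println("List comparisons: " + comparisonsWithList(1000, "missing"));
    }
}
```

For a name that is absent, the List version compares against every one of the 1000 elements, while the Set version only touches entries with a matching hash.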


- Yifan

On Tue, Oct 29, 2024 at 11:53 AM Josh McKenzie  wrote:

> (sorry for the double-post)
>
> Jeremy Hanna kicked this link to a style guideline re: inference my way.
> Interesting read for those that are curious:
> https://openjdk.org/projects/amber/guides/lvti-style-guide
>
> On Tue, Oct 29, 2024, at 2:47 PM, Josh McKenzie wrote:
>
> To illustrate my position from above:
>
> Good usage:
>
> Collection names = new ArrayList<>();
>
> becomes
>
> var names = new ArrayList();
>
>
> Bad usage:
>
> Collection names = myObject.getNames();
>
> becomes
>
> var names = myObject.getNames();
>
>
> Effectively, anything that's not clearly redundant in assignment shouldn't
> use inference IMO. Thinking more deeply on this as well, I think part of
> what I haven't loved is the effective splitting of type information when
> constructing generics:
>
> Map failureReasonsbyEndpoint =
> new ConcurrentHashMap<>();
>
> vs.
>
> var failureReasonsByEndpoint = new ConcurrentHashMap RequestFailureReason>();
>
>
> I strongly agree that we should optimize for readability, and I think
> using type inference to the extreme of every case where it's allowable
> would be the opposite of that. That said, I do believe there are cases where
> judicious use of type inference makes a codebase *more* readable rather
> than less.
>
> All that said, accommodating nuance is hard when it comes to style
> guidelines. A clean simple policy of "don't use type inference outside of
> testing code" is probably more likely to hold up over time for us than
> having more nuanced guidelines.
>
> On Tue, Oct 29, 2024, at 2:19 PM, Štefan Miklošovič wrote:
>
> Yes, for now it is pretty much just in SAI. I wanted to know if this is a
> thing from now on or where we are at with that ...
>
> I am afraid that if we don't make this "right" then we will end up with a
> codebase with inconsistent usage of that and it will be even worse to
> navigate in it in the long term.
>
> I would either ban its usage or allow it only in strictly enumerated
> situations. However, that is just hard to check upon reviews with 100%
> accuracy and I don't think there is some "checker" to check allowed usages
> for us. That being said and to be on the safe side of things I would just
> ban it completely.
>
> Sometimes I am just reading the code from GitHub and it might be also
> tricky to review PRs. Not absolutely every PR is reviewed in IDE, some
> reviews are given without automatically checking it in IDE too and it would
> just make life harder for reviewers if they had to figure out what the
> types are etc ...
>
> On Tue, Oct 29, 2024 at 7:10 PM Brandon Williams  wrote:
>
> On Tue, Oct 29, 2024 at 12:15 PM Štefan Miklošovič
>  wrote:
> > I think this is a new concept here which was introduced recently with
> support of Java 11 / Java 17 after we dropped 8.
>
> To put a finer point on that, 4.1 has 3 hits, none of which are valid,
> while 5.0 has 172.  If 'sai' is added to the 5.0 grep, 85% of them are
> retained.
>
> Kind Regards,
> Brandon
>
>
>
>


Re: [VOTE] CEP-43: Apache Cassandra CREATE TABLE LIKE

2024-11-08 Thread Yifan Cai
+1 (nb)

- Yifan

On Thu, Nov 7, 2024 at 10:31 PM guo Maxwell  wrote:

> Thanks Stefan and Dinesh. Let's wait a little longer.
>
>
> On Thu, Nov 7, 2024 at 11:56 PM Dinesh Joshi  wrote:
>
>> Maxwell, here's the documentation for project governance for reference -
>> https://cwiki.apache.org/confluence/display/CASSANDRA/Cassandra+Project+Governance
>>
>> Like Stefan said, please wait until more binding +1s come in.
>>
>> I'm +1 on the CEP.
>>
>> thanks,
>>
>> Dinesh
>>
>>
>> On Thu, Nov 7, 2024 at 7:51 AM Štefan Miklošovič 
>> wrote:
>>
>>> Hi Maxwell,
>>>
>>> any CEP in general requires more binding votes than 1 from myself.
>>>
>>> I advise you to wait a little bit longer until at least three binding
>>> votes accumulate.
>>>
>>> Regards
>>>
>>> On Thu, Nov 7, 2024 at 4:42 PM guo Maxwell  wrote:
>>>
>>>> Thank you everyone. If there is no other feedback, I feel that this
>>>> vote has passed and CEP-43 is adopted.
>>>>
>>>> On Wed, Nov 6, 2024 at 11:40 PM Bernardo Botella  wrote:
>>>>
>>>>> +1 (nb)
>>>>>
>>>>> Thanks a lot Guo for addressing all the comments!
>>>>>
>>>>> On Nov 6, 2024, at 7:21 AM, Štefan Miklošovič 
>>>>> wrote:
>>>>>
>>>>> Having all cleared out in discussion thread (1), I think we can
>>>>> finally vote on this.
>>>>>
>>>>> +1
>>>>>
>>>>> I welcome everybody to finish this vote or raise other issues in the
>>>>> discussion thread if any.
>>>>>
>>>>> (1) https://lists.apache.org/thread/2z09twbrv75rszpxbm1przxxohpjvkkl
>>>>>
>>>>> On Mon, Nov 4, 2024 at 2:53 AM guo Maxwell 
>>>>> wrote:
>>>>>
>>>>>> Now at this point I think we can continue  the voting for CEP-43 as
>>>>>> all the feedback in the discussion thread seems to be addressed.
>>>>>>
>>>>>> Proposal: CEP43-CREATE TABLE LIKE
>>>>>> <https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-43++Apache+Cassandra+CREATE+TABLE++LIKE>
>>>>>> Discussion thread:  discussion
>>>>>> <https://lists.apache.org/list?dev@cassandra.apache.org:lte=1M:create%20table%20like>
>>>>>>
>>>>>> As per the CEP process documentation, this vote will be open for 72
>>>>>> hours (longer if needed).
>>>>>>
>>>>>> On Wed, Oct 16, 2024 at 7:40 AM Bernardo Botella  wrote:
>>>>>>
>>>>>>> Fair point. I will move my feedback there.
>>>>>>>
>>>>>>> On Oct 15, 2024, at 4:19 PM, Yifan Cai  wrote:
>>>>>>>
>>>>>>> For further discussions, should we use the discussion thread? This
>>>>>>> thread is for voting.
>>>>>>>
>>>>>>> - Yifan
>>>>>>>
>>>>>>> On Tue, Oct 15, 2024 at 3:31 PM Bernardo Botella <
>>>>>>> conta...@bernardobotella.com> wrote:
>>>>>>>
>>>>>>>> Hi Guo,
>>>>>>>>
>>>>>>>> Do you think it would make sense to add a fourth keyword to add
>>>>>>>> after the WITH for Constraints? (See CEP-42)
>>>>>>>>
>>>>>>>> Copying a table without the defined constraints may be useful.
>>>>>>>>
>>>>>>>> Bernardo
>>>>>>>>
>>>>>>>>
>>>>>>>> On Oct 9, 2024, at 9:32 PM, guo Maxwell 
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> ok, I think the time can be two weeks .
>>>>>>>>
>>>>>>>> Looking forward to your feedback.
>>>>>>>>
>>>>>>>> On Thu, Oct 10, 2024 at 11:51 AM Abe Ratnofsky  wrote:
>>>>>>>>
>>>>>>>>> With the CEP only being completed last week and the Community over
>>>>>>>>> Code conference finishing up this week, I'd love to have a few more
>>>>>>>>> days to review and discuss the proposal.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>


Re: [DISCUSS] Modeling JIRA fix version for subprojects

2024-10-01 Thread Yifan Cai
I support the idea of having separate Jira projects. Based on my experience
with both shared namespaces (like Cassandra and Analytics) and dedicated
namespaces (like Sidecar), I've seen the drawbacks of grouping all
subproject tickets under a single project, i.e. Cassandra.

When tickets are consolidated in one project, visibility suffers. For
instance, tickets must have a prefix in their titles, like in this example:
https://issues.apache.org/jira/browse/CASSANDRA-19927. It's not immediately
clear that this ticket pertains to the Analytics subproject without
clicking the link.

Additionally, using just the Cassandra project leads to project
metadata—such as "components" and "labels"—that may not apply to other
subprojects. This can create confusion. In contrast, having distinct Jira
projects ensures that project-specific metadata is well organized and
relevant.

On the other hand, the Cassandra Sidecar has its own dedicated Jira
project, which avoids these issues entirely.

- Yifan

On Tue, Oct 1, 2024 at 7:27 AM Brandon Williams  wrote:

> CEP-8 says "We suggest distinct Jira projects, one per driver, all to
> be created."
>
> Kind Regards,
> Brandon
>
> On Tue, Oct 1, 2024 at 9:23 AM Jon Haddad  wrote:
> >
> > My 2 cents - trying to look through C* JIRA right now is kind of awful
> with different projects all mixed in.  Given that the decision to lump
> everything together seems to have been made unilaterally, against the VOTE,
> I'd say we still need to move drivers off CASSANDRA.
> >
> > Only question is, one for all drivers or one for each driver?
> >
> > Jon
> >
> > On Tue, Oct 1, 2024 at 10:16 AM Brandon Williams 
> wrote:
> >>
> >> What is the status of this thread? Are we looking to move each driver
> >> project to its own jira instance, as voted for in CEP-8?
> >>
> >> Kind Regards,
> >> Brandon
> >>
> >> On Tue, Apr 9, 2024 at 9:29 AM Brandon Williams 
> wrote:
> >> >
> >> > I am +1 on separate projects as well, but to Abe's point I don't think
> >> > it matters now, we had 21 binding votes for CEP-8 which spells this
> >> > out.
> >> >
> >> > Kind Regards,
> >> > Brandon
> >> >
> >> > On Tue, Apr 9, 2024 at 9:24 AM Josh McKenzie 
> wrote:
> >> > >
> >> > > +1 to separate JIRA projects per subproject. Having workflows
> distinct to each project is reason enough for me, nevermind the global
> namespace pollution that occurs if you pack a bunch of disparate projects
> together into one instance.
> >> > >
> >> > > On Mon, Apr 8, 2024, at 9:11 PM, Dinesh Joshi wrote:
> >> > >
> >> > > hi folks - sorry to have dropped the ball on responding to this
> thread.
> >> > >
> >> > > My 2 cents are as follows -
> >> > >
> >> > > 1. Having a separate JIRA project for each sub-project will add
> management overhead. This option, however, allows us to model unique
> workflows for the sub-project.
> >> > >
> >> > > 2. Managing the sub-project as part of the Cassandra JIRA project
> would imply less management overhead but the sub-project would need to
> conform to the same workflows.
> >> > >
> >> > > I would pick option 1 unless there is a strong reason and desire to
> manage a separate Jira project. We can always split out the Java Driver
> project if things don't work out. OTOH merging a Jira project is harder.
> >> > >
> >> > > Thanks,
> >> > >
> >> > > Dinesh
> >> > >
> >> > > On Thu, Apr 4, 2024 at 12:45 PM Abe Ratnofsky  wrote:
> >> > >
> >> > > CEP-8 proposes using separate Jira projects per Cassandra
> sub-project:
> >> > >
> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-8%3A+DataStax+Drivers+Donation
> >> > >
> >> > > > We suggest distinct Jira projects, one per driver, all to be
> created.
> >> > >
> >> > > I don't see any discussion changing that from the [DISCUSS] or vote
> threads:
> >> > > https://lists.apache.org/thread/01pljcncyjyo467l5orh8nf9okrh7oxm
> >> > > https://lists.apache.org/thread/opt630do09phh7hlt28odztxdv6g58dp
> >> > > https://lists.apache.org/thread/crolkrhd4y6tt3k4hsy204xomshlcp4p
> >> > >
> >> > > But looks like upon acceptance that was changed:
> >> > > https://lists.apache.org/thread/dhov01s8dvvh3882oxhkmmfv4tqdd68o
> >> > >
> >> > > > New issues will be tracked under the CASSANDRA project on
> Apache’s JIRA  under
> the component ‘Client/java-driver’.
> >> > >
> >> > > I'm in favor of using the same Jira as Cassandra proper.
> Committership is project-wide, so having a standardized process (same
> ticket flow, review rules, labels, etc.) is beneficial. But multiple votes
> happened based on the content of the CEP, so we should stick to what was
> voted on and move to a separate Jira.
> >> > >
> >> > > --
> >> > > Abe
> >> > >
> >> > >
>


Re: [VOTE] CEP-42: Constraints Framework

2024-10-25 Thread Yifan Cai
Hi Štefan,

The constraint names are to be referenced when altering tables.

I like the option you proposed to completely overwrite the column
constraints during table alterations, removing the need to declare
constraint names. It simplifies the constraint definition.

To iterate on the use case of dropping all constraints on a column, the
following might read more clearly.

ALTER TABLE ks.table DROP CONSTRAINTS ON column_name;


To patch the constraints on a column, what you proposed makes perfect sense
to me.

- Yifan

On Fri, Oct 25, 2024 at 9:27 AM Štefan Miklošovič 
wrote:

> I think you need to name the constraints because you want to do something
> like this, correct?
>
> ALTER TABLE keyspace.table ALTER CONSTRAINT [name] CHECK (condition)
>
> But that is only necessary when there are multiple constraints on a column
> and you want to alter either / both of them.
>
> If we had this syntax:
>
> CREATE TABLE ks.tb (id int, a int CONSTRAINED WITH a > 10);
>
> Then you can alter without name like this:
>
> ALTER TABLE ks.tb ALTER a CONSTRAINED WITH a > 10;
> ALTER TABLE ks.tb ALTER a CONSTRAINED WITH a > 10 AND a < 15;
>
> And we can drop it like this:
>
> ALTER TABLE keyspace.table DROP CONSTRAINT a;
>
> If we have two constraints like this:
>
> CREATE TABLE ks.tb (id int, a int CONSTRAINED WITH a > 10 AND a < 20);
>
> Then it is true that doing this
>
> ALTER TABLE keyspace.table DROP CONSTRAINT a;
>
> would drop BOTH of them. Yes. But on the other hand, I am not sure we can
> justify the alteration of _individual_ constraints by adding complexity.
> Who is actually going to alter just one constraint / part of it anyway?
>
> If I had this:
>
> CREATE TABLE ks.tb (id int, a int CONSTRAINED WITH a > 10 AND a < 20);
>
> And I wanted to have just a > 10 and drop a < 20 then I would do:
>
> ALTER TABLE ks.tb ALTER a CONSTRAINED WITH a > 10;
>
> Instead of
>
> ALTER TABLE keyspace.table DROP CONSTRAINT
> some_name_for_a_lower_than_20;
>
> On Fri, Oct 25, 2024 at 5:18 PM Štefan Miklošovič 
> wrote:
>
>> Thinking about this more ..
>>
>> CREATE TABLE rgb ( name text PRIMARY KEY, r int CONSTRAINED WITH
>> r_value_range_lower_bound CHECK r >= 0 AND r_value_range_upper_bound
>> CHECK r < 256, ... );
>>
>> What about this:
>>
>> CREATE TABLE rgb ( name text PRIMARY KEY, r int CONSTRAINED WITH r >= 0
>> AND r < 256, ... );
>>
>> Why do we need to have names and CHECK after all? I am sorry if this was
>> already answered and I am glad to be educated in this area.
>>
>> Regards
>>
>> On Fri, Oct 25, 2024 at 5:13 PM Štefan Miklošovič 
>> wrote:
>>
>>> 1.1.
>>> CONSTRAINED WITH is good for me
>>>
>>> 1.2
>>> I prefer 1.1. approach.
>>>
>>> 2.
>>> I am for explicit names over generated ones. I think that the only names
>>> which are generated are names for indexes when not specified.
>>>
>>> 3. I am OK with the exclusion. This is an interesting problem. If
>>> somebody wants these two to be constrained and checked then I guess the
>>> solution would be to have them both in a tuple instead of in two different
>>> columns. So we do not need to support this cross-columns feature. However,
>>> I am not sure how we would go around checking tuples. Is that covered? We
>>> would need to find a way how to reference that
>>>
>>> create table a_table (int id, a_tuple tuple, CONSTRAINT
>>> a_tuple_constraint CHECK (a_tuple.1 != a_tuple.2)
>>>
>>> or something similar.
>>>
>>> BTW there is nothing about tuples in that CEP yet.
>>>
>>>
>>>
>>> On Fri, Oct 25, 2024 at 12:21 AM Yifan Cai  wrote:
>>>
>>>> Hello, everyone.
>>>>
>>>> I’ve been reviewing the patch for the constraints framework
>>>> <https://github.com/apache/cassandra/pull/3562>, and I believe there
>>>> are several aspects outlined in CEP-42 that warrant reconsideration. I’d
>>>> like to bring these points up for discussion.
>>>> *1. New Reserved Keyword*
>>>>
>>>> The patch introduces a new reserved keyword, "CONSTRAINT." Since
>>>> reserved keywords cannot be used as identifiers unless quoted, this can
>>>> complicate data definition declarations. We should aim to avoid adding new
>>>> reserved keywords where possible. Here are a couple of alternatives:
>>>>
>>>> 1.1 *Inline Constraint Definit

Re: [VOTE] CEP-42: Constraints Framework

2024-10-25 Thread Yifan Cai
The identifier "a" in the statement "DROP CONSTRAINT a;" might be mistaken
for a constraint name.

Revising it to "DROP CONSTRAINTS ON a" more clearly conveys the intent of
removing all constraints defined on column "a". However, it requires
CONSTRAINTS to be added to the reserved keywords. I would propose a new
iteration.

ALTER TABLE ks.table ALTER [IF EXISTS] <column_name> DROP CONSTRAINTS;


Thank you for providing additional examples to illustrate why constraint
names are unnecessary.

- Yifan

On Fri, Oct 25, 2024 at 11:16 AM Yifan Cai  wrote:

> Hi Štefan,
>
> The constraint names are to be referenced when altering tables.
>
> I like the option you proposed to completely overwrite the column
> constraints during table alterations, removing the need to declare
> constraint names. It simplifies the constraint definition.
>
> To iterate on the use case of dropping constraints of a column entirely,
> the following might read clearer.
>
> ALTER TABLE ks.table DROP CONSTRAINTS ON column_name;
>
>
> To patch the constraints on a column, what you proposed makes perfect
> sense to me.
>
> - Yifan
>
> On Fri, Oct 25, 2024 at 9:27 AM Štefan Miklošovič 
> wrote:
>
>> I think you need to name the constraints because you want to do something
>> like this, correct?
>>
>> ALTER TABLE keyspace.table ALTER CONSTRAINT [name] CHECK (condition)
>>
>> But that is only necessary when there are multiple constraints on a
>> column and you want to alter either / both of them.
>>
>> If we had this syntax:
>>
>> CREATE TABLE ks.tb (id int, a int CONSTRAINED WITH a > 10);
>>
>> Then you can alter without name like this:
>>
>> ALTER TABLE ks.tb ALTER a CONSTRAINED WITH a > 10;
>> ALTER TABLE ks.tb ALTER a CONSTRAINED WITH a > 10 AND a < 15;
>>
>> And we can drop it like this:
>>
>> ALTER TABLE keyspace.table DROP CONSTRAINT a;
>>
>> If we have two constraints like this:
>>
>> CREATE TABLE ks.tb (id int, a int CONSTRAINED WITH a > 10 AND a < 20);
>>
>> Then it is true that doing this
>>
>> ALTER TABLE keyspace.table DROP CONSTRAINT a;
>>
>> would drop BOTH of them. Yes. But on the other hand, I am not sure we can
>> justify the alteration of _individual_ constraints by adding complexity.
>> Who is actually going to alter just one constraint / part of it anyway?
>>
>> If I had this:
>>
>> CREATE TABLE ks.tb (id int, a int CONSTRAINED WITH a > 10 AND a < 20);
>>
>> And I wanted to have just a > 10 and drop a < 20 then I would do:
>>
>> ALTER TABLE ks.tb ALTER a CONSTRAINED WITH a > 10;
>>
>> Instead of
>>
>> ALTER TABLE keyspace.table DROP CONSTRAINT
>> some_name_for_a_lower_than_20;
>>
>> On Fri, Oct 25, 2024 at 5:18 PM Štefan Miklošovič 
>> wrote:
>>
>>> Thinking about this more ..
>>>
>>> CREATE TABLE rgb ( name text PRIMARY KEY, r int CONSTRAINED WITH
>>> r_value_range_lower_bound CHECK r >= 0 AND r_value_range_upper_bound
>>> CHECK r < 256, ... );
>>>
>>> What about this:
>>>
>>> CREATE TABLE rgb ( name text PRIMARY KEY, r int CONSTRAINED WITH r >= 0
>>> AND r < 256, ... );
>>>
>>> Why do we need to have names and CHECK after all? I am sorry if this was
>>> already answered and I am glad to be educated in this area.
>>>
>>> Regards
>>>
>>> On Fri, Oct 25, 2024 at 5:13 PM Štefan Miklošovič <
>>> smikloso...@apache.org> wrote:
>>>
>>>> 1.1.
>>>> CONSTRAINED WITH is good for me
>>>>
>>>> 1.2
>>>> I prefer 1.1. approach.
>>>>
>>>> 2.
>>>> I am for explicit names over generated ones. I think that the only
>>>> names which are generated are names for indexes when not specified.
>>>>
>>>> 3. I am OK with the exclusion. This is an interesting problem. If
>>>> somebody wants these two to be constrained and checked then I guess the
>>>> solution would be to have them both in a tuple instead of in two different
>>>> columns. So we do not need to support this cross-columns feature. However,
>>>> I am not sure how we would go around checking tuples. Is that covered? We
>>>> would need to find a way how to reference that
>>>>
>>>> create table a_table (int id, a_tuple tuple, CONSTRAINT
>>>> a_tuple_constraint CHECK (a_tuple.1 != a_tuple.2)
>>>>
>>>> or som

Re: [VOTE] CEP-43: Apache Cassandra CREATE TABLE LIKE

2024-10-15 Thread Yifan Cai
For further discussions, should we use the discussion thread? This thread
is for voting.

- Yifan

On Tue, Oct 15, 2024 at 3:31 PM Bernardo Botella <
conta...@bernardobotella.com> wrote:

> Hi Guo,
>
> Do you think it would make sense to add a fourth keyword to add after the
> WITH for Constraints? (See CEP-42)
>
> Copying a table without the defined constraints may be useful.
>
> Bernardo
>
>
> On Oct 9, 2024, at 9:32 PM, guo Maxwell  wrote:
>
> ok, I think the time can be two weeks .
>
> Looking forward to your feedback.
>
> On Thu, Oct 10, 2024 at 11:51 AM Abe Ratnofsky  wrote:
>
>> With the CEP only being completed last week and the Community over Code
>> conference finishing up this week, I'd love to have a few more days to
>> review and discuss the proposal.
>
>
>


Re: [DISCUSS] Introduce CREATE TABLE LIKE grammer

2024-10-15 Thread Yifan Cai
Thanks for creating the CEP! I think it is missing Bernardo's comment on
"the need for read permissions on the source table".

CreateTableStatement does not check the permissions outside of the
enclosing keyspace. Having the SELECT permission on the original table is a
requirement for CREATE TABLE LIKE.

- Yifan

On Sun, Sep 29, 2024 at 11:01 PM guo Maxwell  wrote:

> Hello, everyone ,
> I have finished the doc for CEP-43 for CREATE_TABLE_LIKE, as said before;
> looking forward to your suggestions.
>
> On Wed, Sep 25, 2024 at 3:51 AM Štefan Miklošovič  wrote:
>
>> I am sorry I do not follow what you mean, maybe an example would help.
>>
>> On Tue, Sep 24, 2024 at 6:18 PM guo Maxwell  wrote:
>>
>>>
>>> If there are multiple schema information changes in one ddl statement,
>>> will there be schema conflicts in extreme cases?
>>> For example, our statement contains both table creation and index
>>> creation.
>>>
>>> On Tue, Sep 24, 2024 at 8:12 PM guo Maxwell  wrote:
>>>
 +1 on splitting this task  and adding the ability to copy tables
 through different keyspaces in the future.

 On Mon, Sep 23, 2024 at 10:05 PM Štefan Miklošovič  wrote:

> If we have this table
>
> CREATE TABLE ks.tb2 (
> id int PRIMARY KEY,
> name text
> );
>
> I can either specify name of an index on my own like this:
>
> CREATE INDEX name_index ON ks.tb2 (name) ;
>
> or I can let Cassandra to figure that name on its own:
>
> CREATE INDEX ON ks.tb2 (name) ;
>
> in that case it will name that index "tb2_name_idx".
>
> Hence, I would expect that when we do
>
> CREATE TABLE ks.to_copy LIKE ks.tb2 WITH INDICES;
>
> Then ks.to_copy table will have an index which is called
> "to_copy_name_idx" without me doing anything.
>
> For types, we do not need to do anything when we deal with the same
> keyspace. For simplicity, I mentioned that we might deal with the same
> keyspace scenario only for now and iterate on that in the future.
>
> On Mon, Sep 23, 2024 at 8:53 AM guo Maxwell 
> wrote:
>
>> Hello everyone,
>>
>> The CEP is being written, and I encountered some problems during the
>> process that I would like to discuss with you. If you read the description
>> of CASSANDRA-7662, you will find that the original creator of this Jira
>> did not intend to implement structural copying of indexes, views, and
>> triggers, only of the columns and their data types.
>>
>> However, after investigating some db related syntax and function
>> implementation, I found that it may be necessary for us to provide some
>> rich syntax to support the replication of indexes, views, etc.
>>
>> To support selective copying of the basic structure of the table (columns
>> and types), table options, and table-related indexes, views, triggers,
>> etc., we need some new syntax. The PostgreSQL syntax seems relatively
>> comprehensive: it uses the keywords INCLUDING/EXCLUDING to flexibly control
>> the removal and retention of indexes, table information, etc. (see pg
>> create table like). The newly created index name is different from the
>> original table's index name (see "newly copied index names are different
>> from original"); the name is based on some rule.
>> MySQL is relatively simple and copies columns and indexes by default (see
>> mysql create table like), and the newly created index name is the same as
>> the original table's index name.
>>
>> So for Cassandra, I hope it can also support copying index and even
>> view/trigger information. I also hope to be able to flexibly decide which
>> information is copied, like pg.
>>
>> Besides, I think the copy can happen between different keyspaces. And
>> UDT needs to be taken into account.
>>
>> But as we know, index/view/trigger names are all scoped at the keyspace
>> level, so it seems that the newly created index name (or view/trigger
>> name) must be different from the original table's; otherwise the names
>> would clash.
>>
>> So regarding the above problem, one idea I have is that for newly created
>> types, indexes and views, whether under a different keyspace or the same
>> keyspace, we first generate random names for them, and then add the
>> ability to modify the names (for types/indexes/views/triggers) so that
>> users can change the names manually.
>>
>>
>> guo Maxwell  于20

Re: [DISCUSS] Fine grained max size guardails

2025-02-08 Thread Yifan Cai
Thanks for the example.

"SIZE" is in fact "SERIALIZED_SIZE".

The terms size and length are mostly interchangeable, so some modifier on
size will be required to distinguish the two.
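
A small Java sketch of the distinction, under stated assumptions: LENGTH is taken to mean the number of characters (code points), and the serialized size is taken to be the UTF-8 byte count. Neither definition is confirmed by the patch; the class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

public class SizeVsLengthSketch {
    /** Assumed meaning of LENGTH: number of Unicode code points in the value. */
    public static int length(String value) {
        return value.codePointCount(0, value.length());
    }

    /** Assumed meaning of SERIALIZED_SIZE: number of bytes in the UTF-8 encoding. */
    public static int serializedSize(String value) {
        return value.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String ascii = "foo";
        String accented = "h\u00e9llo"; // "hello" with an accented e, written as an escape
        System.out.println(length(ascii) + " chars, " + serializedSize(ascii) + " bytes");
        System.out.println(length(accented) + " chars, " + serializedSize(accented) + " bytes");
    }
}
```

Under these assumptions the two numbers agree for pure ASCII text and diverge as soon as a multi-byte character appears, which is why a bare "SIZE" is easy to misread.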

- Yifan

On Sat, Feb 8, 2025 at 8:50 PM Bernardo Botella <
conta...@bernardobotella.com> wrote:

> Yifan: how is the SIZE constraint from the LENGTH constraint? -> I think
> you are asking how they are different? They are similar, but not exactly
> the same, and it will depend on the actual type of the column they are
> added to. For example, for a blob, SIZE and LENGTH would be equivalent.
> But for strings, they are different. For the string “foo”, LENGTH would
> be 3, but SIZE would be bigger than 3 (depending on the actual encoding
> used).
>
>
> On Feb 8, 2025, at 7:58 PM, Yifan Cai  wrote:
>
> It makes sense to me to have both guardrails (which are for operators) and
> constraints (which are for app owners) to define size limits. Besides the
> difference in the target audience groups, the scope where guardrails and
> constraints are applicable also differs.
>
> However, it is unnecessary to reject a constraint definition if it goes
> beyond the relevant guardrail, as long as the write failure indicates
> whether the size violates the guardrail or the column constraint, which should
> be propagated to clients for transparency.
>
> Btw, how is the SIZE constraint from the LENGTH constraint?
>
> - Yifan
>
> On Sat, Feb 8, 2025 at 6:25 PM Bernardo Botella <
> conta...@bernardobotella.com> wrote:
>
>> Thanks everyone for the inputs.
>>
>> Dinesh: "constraint should not violate the max bound of the guardrail” ->
>> Yes, that statement is true with the proposed patch. With code as is, the
>> write will fail if it either does not comply with the guardrail OR does not
>> comply with the constraint. The CEP touched this as well, stating that
>> guardrails take preference over defined constraints in schemas, so no
>> matter what, these guardrails will always be respected.
>>
>> Thanks,
>> Bernardo
>>
>> On Feb 8, 2025, at 6:09 PM, Dinesh Joshi  wrote:
>>
>> Guardrails and constraints serve distinct purposes. Guardrails allow the
>> operator to define reasonable bounds while constraints allow the developer
>> to do the same in the schema. However the constraint should not violate the
>> max bound of the guardrail. For example, if an operator defines the max
>> size of a column to be 1MiB then a constraint in the schema cannot go
>> beyond this max size limit. This allows the operator to define reasonable
>> limits while allowing the developer control over their application’s limits.
>>
>> On Sat, Feb 8, 2025 at 12:03 PM Bernardo Botella <
>> conta...@bernardobotella.com> wrote:
>>
>>> Hi everyone,
>>>
>>> After Constraints framework was merged in, I would like to come back to
>>> the discussion Jordan brought up in this Jira:
>>> https://issues.apache.org/jira/browse/CASSANDRA-19677
>>>
>>> For context, that Jira ticket (and PR) is adding a bunch of more fine
>>> grained size thresholds for column types using guardrails, expanding on
>>> what these Jiras added:
>>> https://issues.apache.org/jira/browse/CASSANDRA-17151
>>> https://issues.apache.org/jira/browse/CASSANDRA-17150
>>>
>>> Now, we have an alternative way to set sizes on specific columns using
>>> constraints (we have the LENGTH constraint, which is technically different, but
>>> adding a SIZE constraint is on the roadmap and straightforward).
>>>
>>> Jordan raised a really valid concern that these new guardrails may be
>>> adding some noise to an already crowded space such as settings. On the
>>> other hand, these guardrails operate at a different level than constraints,
>>> as they are generic as opposed to column specific.
>>>
>>> We would like to hear what the community thinks in this case. Should
>>> these guardrails go in? Or do we drop them in favour of plain constraints?
>>>
>>> My two cents: My opinion is that these guardrails still add value and
>>> give operators more fine-grained control to "protect" the database.
>>>
>>> Regards,
>>> Bernardo
>>
>>
>>
>


Re: [DISCUSS] NOT_NULL constraint vs STRICTLY_NOT_NULL constraint

2025-02-10 Thread Yifan Cai
While LOOSE_NOT_NULL might improve the clarity a bit, what value does such a
constraint provide to users? It still permits null. Meanwhile, it is easier to
check the nullness of the bound values on the application side.
IMO, what benefits users is a way to ensure no null value can exist for the
constrained columns. Reading the thread, that is the behavior of the strict
version.
How about we just drop the LOOSE one and call the STRICT one “NOT_NULL”?

- Yifan

From: Bernardo Botella 
Sent: Monday, February 10, 2025 8:44:13 AM
To: dev@cassandra.apache.org 
Subject: Re: [DISCUSS] NOT_NULL constraint vs STRICTLY_NOT_NULL constraint

To recap,

The sentiment I am getting is that NOT_NULL allowing null values is too 
confusing. Nice, that’s why we started the thread.

As an alternative, instead of ditching the loose not null constraint, I propose 
we change the “default” behavior. From my initial proposal, I suggest renaming 
the Constraints:
- NOT_NULL -> LOOSE_NOT_NULL
- STRICTLY_NOT_NULL -> NOT_NULL

The reasoning behind trying to keep it is:
- It is already implemented.
- By being explicit that it is loose, we avoid the confusion about it allowing 
nulls.
- It still adds value on its own.

With this, the “by default” NOT_NULL doesn’t allow null or absent values in the 
insert statement, while we still support the more relaxed LOOSE_NOT_NULL for 
updates.
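
To make the difference between the two proposed behaviors concrete, here is a minimal sketch in Python. It is not the Cassandra implementation: `NullCheck`, `validate_write` and the dict-based representation of a write are all hypothetical names chosen for illustration. It only models the semantics described above, where the loose variant checks just the values present in the statement and the strict variant also rejects statements that omit the column.

```python
from enum import Enum


class NullCheck(Enum):
    # Hypothetical names mirroring the proposed constraint names.
    LOOSE_NOT_NULL = "loose"  # rejects explicit nulls only
    NOT_NULL = "strict"       # also rejects writes that omit the column


def validate_write(mode, column, written):
    """Return None if the write passes, else an error message.

    `written` maps column names to the values bound in the statement;
    a column missing from the dict was not specified at all.
    """
    if column not in written:
        if mode is NullCheck.NOT_NULL:
            return f"column '{column}' must be specified"
        # Loose mode ignores columns the write does not touch.
        return None
    if written[column] is None:
        return f"column '{column}' must not be null"
    return None
```

Under these assumptions, a write that omits `val` passes the loose check and fails the strict one, while an explicit null fails both. Neither check needs to read the existing row first.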

Thoughts?


On Feb 10, 2025, at 8:29 AM, Štefan Miklošovič  wrote:



On Mon, Feb 10, 2025 at 5:20 PM Dinesh Joshi <djo...@apache.org> wrote:
In my head NOT_NULL constraint implies that the column must be specified on 
each write and must not be NULL. If a column with the NOT_NULL constraint is 
omitted during a write then shouldn’t it be treated as if it was specified and 
set to NULL?

Well, yes. One may also look at it that way. But then we would end up with 
"null" in a column, which would be quite surprising for users: they would 
assume that if they specified a column as NOT NULL at table creation, it is 
"guaranteed" never to be null. It just looks strange for the table schema to 
say a column is not null when it actually might be.


If the column has a non-NULL value that was previously written and you’re 
updating the rest of the columns, you still have to force the user to specify 
it otherwise you will have to perform a read before write to validate that the 
column was not NULL. I think this is a fine compromise given that the goal here 
is to ensure that an application shouldn’t inadvertently write a NULL value for 
a column specified as NOT_NULL.


Yes. I see it the same way.

On Mon, Feb 10, 2025 at 6:50 AM Bernardo Botella 
<conta...@bernardobotella.com> wrote:
Hi everyone,

Stefan Miklosovic and I have been working on a NOT_NULL 
(https://github.com/apache/cassandra/pull/3867) constraint to be added to the 
constraints tool belt, and a really interesting conversation came up.

First, as a problem statement, let's consider this:

-
CREATE TABLE ks.tb2 (
    id int,
    cl1 int,
    cl2 int,
    val text CHECK NOT_NULL(val),
    PRIMARY KEY (id, cl1, cl2)
);

cassandra@cqlsh> INSERT INTO ks.tb2 (id, cl1, cl2, val) VALUES ( 1, 2, 3, null);
InvalidRequest: Error from server: code=2200 [Invalid query] message="Column 
value does not satisfy value constraint for column 'val' as it is null."

cassandra@cqlsh> INSERT INTO ks.tb2 (id, cl1, cl2, val) VALUES ( 1, 2, 3, 'text');
cassandra@cqlsh> select * from ks.tb2;

 id | cl1 | cl2 | val
----+-----+-----+------
  1 |   2 |   3 | text

(1 rows)
cassandra@cqlsh> INSERT INTO ks.tb2 (id, cl1, cl2) VALUES ( 1, 2, 4);
cassandra@cqlsh> select * from ks.tb2;

 id | cl1 | cl2 | val
----+-----+-----+------
  1 |   2 |   3 | text
  1 |   2 |   4 | null

-

As you can see, there is a hole: whenever the column is NOT specified on the 
write, a 'null' value gets written to column val even though that column has a 
NOT_NULL constraint. That raises the question of how this particular 
constraint should behave.

If we consider the other constraints (scalar constraint and length constraint 
so far), this particular behavior is fine. But, if the constraint is NOT_NULL, 
then it becomes a little bit trickier.

The conclusion we have reached is that the meaning of constraints should be 
interpreted like: I check whatever you give me as part of the write, ignoring 
everything else. Let me elaborate:
If we decide to treat this particular NOT_NULL constraint differently, and 
check if the value for that column is present in the insert statement, we then 
open a different can of worms. What happens if the row already exists with a 
valid value, and that insert statement is only trying to do an update to a 
different column in the row? If that was the case, we would be forcing the user 
to specify the 'val' column value for every update, even if it i
