RE: Re: Is there appetite to maintain the gocql driver (in the drivers subproject) ?

2024-05-02 Thread Rolo, Carlos via dev
Hello,

We are seriously looking at picking up this driver here at Instaclustr. I will 
keep this thread updated.

On 2024/04/15 10:31:49 Mick Semb Wever wrote:
> We can open an issue with LEGAL to see what they say at least?
> >>
> >
> >
> > I will raise a LEGAL ticket.
> >
> > My question here is whether we have to go through the process of
> > best-efforts to get approval to donate (transfer copyright), or whether we
> > can honour the copyright on the prior work and move forward ( by
> > referencing it in our NOTICE.txt, as per
> > https://infra.apache.org/licensing-howto.html )
> >
>
>
> https://issues.apache.org/jira/browse/LEGAL-674
>


Carlos Rolo
Manager
Open Source Contributions

NetApp
carlos.r...@netapp.com



Re: [DISCUSS] Request for feedback re: new large features to (potentially) be added to the Apache Cassandra drivers still maintained by DataStax

2024-07-17 Thread Rolo, Carlos via dev
Hello!

Thanks for this! Would you release the results of the poll? That would be 
helpful since we are going to contribute to GoCQL once the donation is in place.

Cheers,

Carlos

From: Bret McGuire 
Sent: 17 July 2024 00:05
To: dev@cassandra.apache.org 
Subject: [DISCUSS] Request for feedback re: new large features to (potentially) 
be added to the Apache Cassandra drivers still maintained by DataStax

EXTERNAL EMAIL - USE CAUTION when clicking links or attachments



   Greetings all!  As we get closer and closer to the release of Apache 
Cassandra 5.0 the drivers team at DataStax would like to solicit feedback from 
the Cassandra community about what major features users would like to see 
across the various actively maintained drivers.  We’re considering focusing on 
one of the following:

* Robust support for vectors across all drivers [1]
* Robust support for protocol version 5 across all drivers [2]

   To be clear, we’re discussing adding these features to the drivers that are 
actively maintained by DataStax.  CEP-8 discusses the set of drivers to be 
donated by DataStax to The Apache Software 
Foundation (ASF).  Of these the Java driver has already been donated and the 
Ruby and PHP drivers are in maintenance mode, so for this conversation we’re 
talking about adding a new feature to the Python, node.js, C/C++ and C# 
drivers.  We’re also happy to work with the ASF to keep these changes in sync 
with similar functionality in the Java driver.  Finally, we’ll look into 
extending support to gocql once it has been donated to the ASF.

   I’ve set up a poll here so please let us know which feature above you’d like 
to see in a Cassandra driver near you!

   - Bret -


[1] Vectors are currently only supported by the Java, Python and node.js 
drivers.  For each of these drivers only float subtypes are supported.
[2] Complete support for the v5 protocol is currently only provided by the Java 
driver


Re: GoCQL module name change and next release

2024-10-14 Thread Rolo, Carlos via dev
I agree that it should be 1) and 2) together.

Regarding the PR review for 2.0, I can look into some (I'm not proficient with 
Go, it has been a long time) but for triaging what should be in a major I can 
surely help. Low hanging fruit should also not be a problem for me to review.

Are we looking to review/triage all existing PRs?

Thanks,

Carlos

From: João Reis 
Sent: 11 October 2024 16:03
To: dev@cassandra.apache.org 
Subject: Re: GoCQL module name change and next release




I like that too, I initially considered proposing that but thought I'd propose 
delivering v5 and vector support faster but +1 on making these two features a 
part of 2.0.0 instead.

Štefan Miklošovič <smikloso...@apache.org> wrote (Thursday, 10/10/2024 at 23:37):

I would do 1) and 2) together. I would rename the module in 2.0.0 and there 
would be v5 and vector support as well. It would motivate us to get v5 and 
vector out while using that opportunity to rename it to 2.0.0.

On Thu, Oct 10, 2024 at 11:42 AM João Reis <joaor...@apache.org> wrote:
Hi,

Following the GoCQL donation we need to change the module name so it matches 
the new repo URL. Currently users have to keep using the old name unless they 
add a rewrite to go.mod.

There was a discussion on what the approach should be on #1776 [1] and I've 
created CASSANDRA-19993 [2] to track this work. Since this is a breaking change 
(it requires users to modify their imports or add a rewrite to go.mod) we need 
to bump the major version when we do this.
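
For readers unfamiliar with the go.mod rewrite mentioned here, one plausible form is a replace directive. This is only a sketch: the v1.7.0 version and the example.com/myapp module path are illustrative assumptions, and it presumes the tagged release in the donated repository still declares the old module path. Application imports would keep using github.com/gocql/gocql; only the download source changes.

```
module example.com/myapp // hypothetical application module

go 1.21

require github.com/gocql/gocql v1.7.0

// Assumed rewrite: resolve the old module path from the donated Apache repository.
replace github.com/gocql/gocql => github.com/apache/cassandra-gocql-driver v1.7.0
```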

With this in mind, there are some topics to discuss:

1) When do we want to make this module name change happen? We can keep doing 
minor releases under the old module name but it is a bit confusing for new 
users to have to import gocql using a GitHub repo URL that effectively no 
longer exists. Also, the number of users impacted by the module name change 
will only increase the longer we wait.

2) Should we take this opportunity to include other breaking changes with the 
module name change? Martin mentioned on the gocql mailing list [3] that there 
are a few issues on GitHub that are tagged with "semver-major" [4] and these 
should be considered for a new major release.

My take on these topics is that we should work on some of those tagged issues 
when we decide to change the module name, as long as these breaking changes 
don't require users to significantly rewrite parts of their application. We 
should make this upgrade and module name change as unintrusive as possible for 
users. Note that the cassandra-gocql-driver project officially maintains the 
latest release only (single active branch), so doing a major release 
effectively drops support for the previous major immediately, which means we 
have an even stronger incentive to make the upgrade as easy as possible.

I'm planning on reviewing protocol v5 and vector support PRs soon and we should 
probably make these 2 contributions available to users as soon as possible. To 
do this we can do a minor release before we start working on the next major. 
I'm also open to the idea that we could postpone the major release development 
and keep doing minor releases for a little longer under the old module name.

In summary, my proposal for a short to medium term roadmap is:

1) Release 1.8.0 with v5 and vector support (and potentially other small PRs, 
I've only looked at these 2 issues so far)
2) Release 2.0.0 with the module name change, some (if not all) of the 
"semver-major" tagged issues and other contributions

Let me know your thoughts,
João Reis

[1] https://github.com/apache/cassandra-gocql-driver/issues/1776
[2] https://issues.apache.org/jira/browse/CASSANDRA-19993
[3] https://groups.google.com/g/gocql/c/v0FruczBb2w/m/7Hc3_W9QCgAJ
[4] 
https://github.com/apache/cassandra-gocql-driver/issues?q=is%3Aopen+is%3Aissue+label%3Asemver-major


Re: [VOTE] Release Apache Cassandra Gocql Driver 1.7.0

2024-09-30 Thread Rolo, Carlos via dev
My vote doesn't count, but this would be a +1 from me.

From: Brandon Williams 
Sent: 30 September 2024 14:06
To: dev@cassandra.apache.org 
Subject: Re: [VOTE] Release Apache Cassandra Gocql Driver 1.7.0





+1

Kind Regards,
Brandon

On Thu, Sep 26, 2024 at 3:46 PM Bret McGuire  wrote:
>
>Greetings all!
>
>I’m proposing the Cassandra Gocql Driver 1.7.0 for release.
>
> sha1: 953e0df999cabb3f5eef714df9921c00e9f632c2
> git: 
> https://github.com/apache/cassandra-gocql-driver/tree/v1.7.0-rc1
>
>The Source release is available here:
> https://dist.apache.org/repos/dist/dev/cassandra/cassandra-gocql-driver/1.7.0/
>
>This is the first release of the Gocql Driver since its donation.  
> Developers will include this driver in their projects by specifying a commit 
> hash or tag in go.mod rather than via the inclusion of binary artifacts so 
> we’ve avoided the creation of binary artifacts completely.  To avoid any 
> premature access to this release before the vote is complete we’ve 
> temporarily used the tag “v1.7.0-rc1” to clearly indicate that this tag 
> points to a release candidate.  Once the vote passes this tag will be updated 
> to “v1.7.0”.
>
>The vote will be open for 72 hours (longer if needed). Everyone who has 
> tested the build is invited to vote. Votes by PMC members are considered 
> binding. A vote passes if there are at least three binding +1s and no -1's.
>
>Thanks all!
>
>   - Bret -
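
The passing rule quoted above (at least three binding +1s and no -1s) can be sketched in Go as follows. This is an illustration only, not an ASF tool: the Vote type and Passes function are hypothetical names, and counting any -1 (binding or not) as blocking is an assumption drawn from the wording of the email.

```go
package main

import "fmt"

// Vote is one reply on a release thread (hypothetical type for illustration).
// Binding is true when the voter is a PMC member.
type Vote struct {
	Value   int // +1, 0 or -1
	Binding bool
}

// Passes applies the rule as worded in the email: at least three binding +1s
// and no -1s. Treating every -1 as blocking is an assumption.
func Passes(votes []Vote) bool {
	bindingPlus := 0
	for _, v := range votes {
		switch {
		case v.Value < 0:
			return false
		case v.Value > 0 && v.Binding:
			bindingPlus++
		}
	}
	return bindingPlus >= 3
}

func main() {
	votes := []Vote{
		{Value: 1, Binding: true},
		{Value: 1, Binding: true},
		{Value: 1, Binding: true},
		{Value: 1, Binding: false}, // non-binding +1: recorded but not counted
	}
	fmt.Println(Passes(votes))
}
```

A non-binding +1, like the one at the top of this message, does not change the outcome under this sketch.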


Re: Looking for Cassandra Forward topics and speakers

2025-01-30 Thread Rolo, Carlos via dev
Hello Patrick,

Count me in!

I would like to pick either

 - CQL Management API: Can we just celebrate the end of JMX hell
potentially? Who can talk about this?

Or

 - SAI enhancements were recently highlighted by Caleb on the ML. This
isn't just a one-and-done feature, and its future is really cool.

What would be your timeline for this? Option 2 sounds great to me!

Cheers,

Carlos



From: Patrick McFadin 
Sent: 29 January 2025 19:48
To: dev 
Subject: Looking for Cassandra Forward topics and speakers





Hi everyone,

A couple of years ago, I organized a Cassandra Forward event to get
people excited about the next version of Cassandra. It's time to ramp
up the excitement about one of the more consequential releases of
Cassandra: 5.1 or 6. Whatever we land on as a version will impact the
community with the addition of ACID transactions.

There is a lot more to talk about, so my plan is to create a nicely
rounded menu of topics that showcase the velocity of our new features.

Format: Online and prerecorded.
Date: First week of March

My main issue is needing speakers. Here's the topic list (please feel
free to comment on this)

 - Accord: I can cover this like I have been. No need for a speaker there.

 - TCM: I think the most important thing nobody knows about with
future Cassandra.

 - Sidecar (Spark jobs, Live migration) #2 on my list of "things you
should know about Cassandra but don't."  This project has been in the
shadows too long and I think users will love it.

 - Cassandra and Kubernetes: Not especially new, but certainly ramping
up. This would be a great topic for an end user to discuss. Share the
good and bad.

 - CQL Management API: Can we just celebrate the end of JMX hell
potentially? Who can talk about this?

 - SAI enhancements were recently highlighted by Caleb on the ML. This
isn't just a one-and-done feature, and its future is really cool.

This you?

I DON'T HAVE TIME FOR ANY OF THIS!  Here are some options:

 - If you want to give a talk but don't want to deal with the
logistics of recording, I can get you on Zoom and record it.
 - Don't have time to create the content for a talk? I can get you in
a Zoom and do an interview style recording. 30-60 minutes of your
time.
 - Can't get permission to talk? Let's find somebody who can give the
talk, and then we can work together to ensure the content is right.
 - I will feed java files into ChatGPT about the feature you love and
present it like a boss.

I recommend choices 1-3.

Thanks, everyone. I appreciate your time if you got this far.

Patrick


Re: Wild card search | Cassandra

2025-04-30 Thread Rolo, Carlos via dev
That would be in the next release of SAI:

[CASSANDRA-18493] SAI - LIKE prefix/suffix support - ASF JIRA

From: manish sharma 
Sent: 24 April 2025 18:31
To: dev@cassandra.apache.org 
Subject: Wild card search | Cassandra





Hi Team,
I am wondering if there is any roadmap or existing functionality to perform 
LIKE queries with infix or prefix patterns, such as ‘%cat’ or ‘c%at’, without 
using Solr or Elasticsearch.

Appreciate your response!

Regards,
Manish
Sent from my iPhone


Re: [DISCUSS] Requirement to document features before releasing them

2025-05-01 Thread Rolo, Carlos via dev
I am a bit out of the loop on how/if this would extend to driver sub-projects.

This makes 100% sense, in the driver space as well. Looking at the Java 
driver docs and making the others similar would be great.

Patrick, that LLM suggestion might be a life saver, let me try that!

From: Miklosovic, Stefan via dev 
Sent: 01 May 2025 08:07
To: David Capwell ; dev@cassandra.apache.org 

Cc: Miklosovic, Stefan 
Subject: Re: [DISCUSS] Requirement to document features before releasing them





Denser is better. In your oversimplified example of Accord, as a user who 
encounters this for the first time, I am definitely interested in what the 
limitations are. What might happen quite easily is that if it is not dense and 
we just announce it sparsely, then a user takes it all at face value, and if it 
starts to diverge from your proclamation then they might feel like they were 
lied to or start to be disappointed. You got me? Users do not like surprises 
they discover themselves while trying it out (often painfully). They just want 
to know what they are buying themselves into.



If there are super-corner-case details, they might be omitted as we have other 
channels of communication (Slack, mailing list ...) but in general I do not 
see how a lot of documentation would be bad.



It also depends on who you are writing that documentation for. As said, we talk 
about user-facing docs here. Documentation for developers, where we are trying 
to bootstrap them / orient them in the code base, is going to be substantially 
different from user-facing documentation.





From: David Capwell 
Date: Wednesday, 30 April 2025 at 23:35
To: dev@cassandra.apache.org 
Cc: Miklosovic, Stefan 
Subject: Re: [DISCUSS] Requirement to document features before releasing them




I wonder at what level we can enforce this.  What I mean is, in modeling testing 
I have found some odd behaviors that people were not aware of (BATCH cell 
resolution, NULL handling (emptiness...), etc.), so if documentation is dense 
this can help force people to think through edge cases or how 2 features 
interact with each other. If documentation is sparse, then you lose this 
benefit.



Simple example for Accord



# Sparse



Multiple key transaction support, bringing Apache Cassandra cluster to the RDBMS 
world!



# Dense



…



Here are the current limitations, …



Here is where we alter Apache Cassandra’s behavior to be more in line with SQL, 
...



On Apr 30, 2025, at 1:38 PM, Miklosovic, Stefan via dev wrote:





To extend the first e-mail to cover the practicalities:



  1.  changes introduced to nodetool would not be part of this because they are 
self-documented (the help docs are autogenerated)
  2.  introduction of changes into cassandra.yaml is already covered as that is 
autogenerated and published on the website as well.
  3.  Applying common sense, if it is enough just to mention it in NEWS.txt, that 
is also fine.
  4.  metrics - I bet there are some which are not documented, we should find a 
way to autogenerate them into the website.



I am also to blame and, to show I am not a hypocrite, I admit I have never 
delivered in-depth user documentation of CEP-24 with examples, use cases, and 
so on. I am trying to be more aware of the documentation when delivering 
features, to raise awareness about that etc. It is easy to not think about this 
too much when developers are in a rush. If there were a hard requirement for 
the documentation, I would have done it right away and would not need to deal 
with this now.



I understand that when delivering heavy-weights like CEP-15 we can not expect 
that all the docs will be done upon delivery but I want to stress the fact that 
providing usable documentation should be definitely something to think about 
when releasing it. Same goes for all other non-trivial features.





From: Josh McKenzie <jmcken...@apache.org>
Date: Wednesday, 30 April 2025 at 22:11
To: dev <dev@cassandra.apache.org>
Cc: Miklosovic, Stefan <stefan.mikloso...@netapp.com>
Subject: Re: [DISCUSS] Requirement to document features before releasing them




This makes intuitive sense to me.



In our case we could tie documentation to the process of promoting a feature 
from “experimental” to production ready, though I fear that might leave wiggle 
room for primary authors of some features to leave them as experimental 
forever, not desiring to take on the burden of documenting something that’s 
already merged in and usable by experts.



Curious what others think.



On Wed, Apr 30, 2025, at 12:10 PM, Miklosovic, Stefan via dev wrote:

I am on OpenSearchCon and there was a discussion about the documentation of 
features. In a nutshell, the policy they seem to 

Re: [DISCUSS] Requirement to document features before releasing them

2025-05-01 Thread Rolo, Carlos via dev
My view from the "balcony" here:

I think some documentation is better than no documentation. If I had to 
start in Cassandra right now, I would have a massive challenge. The project 
has evolved a lot, documentation is not great, and the community is not the 
20/30ish people of the past that were in Slack most of the time.

Documentation helps people get into the project and grows the community. Not 
everyone is open to getting into Slack and asking questions.

From what I know, if one of the committers votes against, changes have to be 
made. If documentation is only a consideration, my concern is that enforcing it 
would fall on the couple of committers willing to vote against. And if those 
people did not vote, could the feature pass without proper docs?

So, having some sort of rule would be an improvement? But I leave this to the 
people on the "dance floor" to decide.

@Jon, @Patrick I've tested this with a driver PR from my team, and it works 
wonders!

Cheers,

Carlos

From: Brandon Williams 
Sent: 01 May 2025 15:51
To: dev@cassandra.apache.org 
Subject: Re: [DISCUSS] Requirement to document features before releasing them




On Thu, May 1, 2025 at 7:37 AM Benedict <bened...@apache.org> wrote:
I am opposed to this. There’s too much imprecision in the “rule” while 
simultaneously being much too rigid, and it will be improperly enforced (we 
already have lots of rule breaking around modifying public APIs, that should 
have discuss threads and do not, for instance). This kind of arbitrary rule 
that is unaligned with contributors will likely lead to bad and inconsistent 
documentation, which is worse than no documentation.

I agree.  This is too strong a measure.  We have a documentation problem but 
trying to poorly litigate ourselves into fixing it isn't the best way.

We could perhaps stipulate that for a feature to leave experimental status the 
community must vote and that documentation should be a consideration. But this 
will only capture big changes.

We could perhaps try other ideas like moratoriums on contributions that are not 
documentation, to encourage improvements there.

We could perhaps try having LLMs generate documentation that new contributors 
could take a first pass at editing for correctness, before a committer takes a 
final pass.

I would support these ideas and others if we decide to try any of them out.  
I'm all for improving this situation.

Kind Regards,
Brandon



Re: [DISCUSS] CEP-36: A Configurable ChannelProxy to alias external storage locations

2025-03-04 Thread Rolo, Carlos via dev
Hello,

I would love to discuss this and provide feedback and design work for it. 
Since I'm not an experienced Java programmer I can't be hands-on with the code, 
but I can pick this up and try to carry it forward.

Carlos


From: guo Maxwell 
Sent: 26 February 2025 14:54
To: dev@cassandra.apache.org 
Subject: Re: [DISCUSS] CEP-36: A Configurable ChannelProxy to alias external 
storage locations




Is anyone else interested in continuing to discuss this topic?

guo Maxwell <cclive1...@gmail.com> wrote on Friday, 20 September 2024 at 09:44:
I discussed this offline with Claude, he is no longer working on this.

It's a pity. I think this is a very valuable thing. Commitlog's archiving and 
restore may be able to use the relevant code if it is completed.

Patrick McFadin <pmcfa...@gmail.com> wrote on Friday, 20 September 2024 at 2:01 AM:
Thanks for reviving this one!

On Wed, Sep 18, 2024 at 12:06 AM guo Maxwell <cclive1...@gmail.com> wrote:
Is there any update on this topic?  It seems that things could make significant 
progress if Jake Luciani can find someone who can make the FileSystemProvider 
code accessible.

Jon Haddad <j...@jonhaddad.com> wrote on Saturday, 16 December 2023 at 05:29:
At a high level I really like the idea of being able to better leverage cheaper 
storage especially object stores like S3.

One important thing though - I feel pretty strongly that there's a big, deal 
breaking downside.   Backups, disk failure policies, snapshots and possibly 
repairs would get more complicated which haven't been particularly great in the 
past, and of course there's the issue of failure recovery being only partially 
possible if you're looking at a durable block store paired with an ephemeral 
one with some of your data not replicated to the cold side.  That introduces a 
failure case that's unacceptable for most teams, which results in needing to 
implement potentially 2 different backup solutions.  This is operationally 
complex with a lot of surface area for headaches.  I think a lot of teams would 
probably have an issue with the big question mark around durability and I 
probably would avoid it myself.

On the other hand, I'm +1 if we approach it slightly differently - 
where _all_ the data is located on the cold storage, with the local hot storage 
used as a cache.  This means we can use the cold directories for the complete 
dataset, simplifying backups and node replacements.

For a little background, we had a ticket several years ago where I pointed out 
it was possible to do this *today* at the operating system level as long as 
you're using block devices (vs an object store) and LVM [1].  For example, this 
works well with GP3 EBS w/ low IOPS provisioning + local NVMe to get a nice 
balance of great read performance without going nuts on the cost for IOPS.  I 
also wrote about this in a little more detail in my blog [2].  There's also the 
new mount point tech in AWS which pretty much does exactly what I've suggested 
above [3] that's probably worth evaluating just to get a feel for it.

I'm not insisting we require LVM or the AWS S3 fs, since that would rule out 
other cloud providers, but I am pretty confident that the entire dataset should 
reside in the "cold" side of things for the practical and technical reasons I 
listed above.  I don't think it massively changes the proposal, and should 
simplify things for everyone.

Jon

[1] https://issues.apache.org/jira/browse/CASSANDRA-8460
[2] https://rustyrazorblade.com/post/2018/2018-04-24-intro-to-lvm/
[3] https://aws.amazon.com/about-aws/whats-new/2023/03/mountpoint-amazon-s3/


On Thu, Dec 14, 2023 at 1:56 AM Claude Warren <cla...@apache.org> wrote:
Is there still interest in this?  Can we get some points down on electrons so 
that we all understand the issues?

While it is fairly simple to redirect the read/write to something other than 
the local system for a single node, this will not solve the problem for tiered 
storage.

Tiered storage will require that on read/write the primary key be assessed to 
determine whether the read/write should be redirected.  My reasoning for this 
statement is that in a cluster with a replication factor greater than 1 the 
node will store data for the keys that would be allocated to it in a cluster 
with a replication factor = 1, as well as some keys from nodes earlier in