Re: Is there appetite to maintain the gocql driver (in the drivers subproject) ?

2024-05-15 Thread Mick Semb Wever
>
> Ok, so we've got confidence now on how to approach this, confirmation from
>> the project's maintainers supporting it, and interest from a handful of
>> people interested in maintaining and contributing to the project.
>>
>
> Did you talk to the current maintainers off list or did I miss some thread
> where the maintainers indicated their support in maintaining this project?
>



Yes Dinesh.   João Reis managed to get hold of both Chris and Martin.
Responses have been slow, but everyone is on board.  This is not to be
considered a hostile fork, despite in all likelihood not being able to do a
full IP donation.


Re: Is there appetite to maintain the gocql driver (in the drivers subproject) ?

2024-05-15 Thread Dinesh Joshi
On Wed, May 15, 2024 at 12:09 AM Mick Semb Wever  wrote:

> Yes Dinesh.   João Reis managed to get hold of both Chris and Martin.
> Responses have been slow, but everyone is on board.  This is not to be
> considered a hostile fork, despite in all likelihood not being able to do a
> full IP donation.
>
Great! I have no concerns at this point.


Re: [DISCUSS] Adding support for BETWEEN operator

2024-05-15 Thread Josh McKenzie
> Is there a technical limitation that would prevent a range write that 
> functions the same way as a range tombstone, other than probably needing a 
> version bump of the storage format?
The technical limitation would be cost/benefit due to how this intersects w/our 
architecture I think.

Range tombstones have taught us that something that should be relatively simple 
(merge in deletion mask at read time) introduces a significant amount of 
complexity on all the paths Benjamin enumerated with a pretty long tail of bugs 
and data incorrectness issues and edge cases. The work to get there, at a high 
level glance, would be:
 1. Updates to CQL grammar, spec
 2. Updates to write path
 3. Updates to accord. And thinking about how this intersects w/accord's WAL / 
logic (I think? Consider me not well educated on details here)
 4. Updates to compaction w/consideration for edge cases on all the different 
compaction strategies
 5. Updates to iteration and merge logic
 6. Updates to paging logic
 7. Indexing
 8. repair, both full and incremental implications, support, etc
 9. the list probably goes on? There's always >= 1 thing we're not thinking of 
with a change like this. Usually more.
For all of the above we would also need extensive unit, integration, and fuzz 
testing to ensure the introduction of this new spanning concept on a write 
doesn't introduce edge cases where incorrect data is returned on merge.

All of which is to say: it's an interesting problem, but IMO given our 
architecture and what we know about the past of trying to introduce an 
architectural concept like this, the costs of getting something like this to 
production ready are pretty high.

To me the cost/benefit don't really balance out. Just my .02 though.

On Tue, May 14, 2024, at 2:50 PM, Benjamin Lerer wrote:
>> It would be a lot more constructive to apply our brains towards solving an 
>> interesting problem than pointing out all its potential flaws based on gut 
>> feelings.
> 
> It is not simply a gut feeling, Jon. This change impacts read, write, 
> indexing, storage, compaction, repair... The risk and cost associated with it 
> are pretty significant and I am not convinced at this point of its benefit.
> 
> On Tue, May 14, 2024 at 7:05 PM, Jon Haddad wrote:
>> Personally, I don't think that something being scary at first glance is a 
>> good reason not to explore an idea.  The scenario you've described here is 
>> tricky but I'm not expecting it to be any worse than say, SAI, which (the 
>> last I checked) has O(N) complexity on returning result sets with regard to 
>> rows returned.  We've also merged in Vector search which has O(N) overhead 
>> with the number of SSTables.  We're still fundamentally looking at, in most 
>> cases, a limited number of SSTables and some merging of values.
>> 
>> Write updates are essentially a timestamped mask, potentially overlapping, 
>> and I suspect potentially resolvable during compaction by propagating the 
>> values.  They could be eliminated or narrowed based on how they've 
>> propagated by using the timestamp metadata on the SSTable.
>> 
>> It would be a lot more constructive to apply our brains towards solving an 
>> interesting problem than pointing out all its potential flaws based on gut 
>> feelings.  We haven't even moved this past an idea.  
>> 
>> I think it would solve a massive problem for a lot of people and is 100% 
>> worth considering.  Thanks Patrick and David for raising this.
>> 
>> Jon
>> 
>> 
>> 
>> On Tue, May 14, 2024 at 9:48 AM Bowen Song via dev 
>>  wrote:
>>> Ranged update sounds like a disaster for compaction and read performance.
>>> 
>>> Imagine compacting or reading some SSTables in which a large number of 
>>> overlapping but non-identical ranges were updated with different values. It 
>>> gives me a headache by just thinking about it.
>>> 
>>> Ranged delete is much simpler, because the "value" is the same tombstone 
>>> marker, and it also is guaranteed to expire and disappear eventually, so 
>>> the performance impact of dealing with them at read and compaction time 
>>> doesn't suffer in the long term.
>>> 
>>> 
>>> On 14/05/2024 16:59, Benjamin Lerer wrote:
 It should be like range tombstones ... in much worse ;-). A tombstone is a 
 simple marker (deleted). An update can be far more complex.  
 
 On Tue, May 14, 2024 at 3:52 PM, Jon Haddad wrote:
> Is there a technical limitation that would prevent a range write that 
> functions the same way as a range tombstone, other than probably needing 
> a version bump of the storage format?
> 
> 
> On Tue, May 14, 2024 at 12:03 AM Benjamin Lerer  wrote:
>> Range restrictions (>, >=, <=, < and BETWEEN) do not work on UPDATEs. 
>> They do work on DELETE because under the hood C* translates them 
>> into range tombstones.
>> 
>> On Tue, May 14, 2024 at 2:44 AM, David Capwell wrote:
>>> I would also include in UPDATE… but yeah, <3 BETWEEN and welcome this work.

[DISCUSS] ccm as a subproject

2024-05-15 Thread Josh McKenzie
Right now ccm isn't formally a subproject of Cassandra or under governance of 
the ASF. Given it's an integral component of our CI as well as of local 
testing for many devs, and we now have more experience w/our muscle on IP 
clearance and ingesting / absorbing subprojects where we can't track down every 
single contributor to get an ICLA, it seems like it might be worth revisiting 
the topic of donating ccm to Apache.

For what it's worth, Sylvain originally and then DataStax after transfer have 
both been incredible and receptive stewards of the projects and repos, so this 
isn't about any response to any behavior on their part. Structurally, however, 
it'd be better for the health of the project(s) long-term to have ccm promoted 
in. As far as I know there was strong receptivity to that donation in the past 
but the IP clearance was the primary hurdle.

Anyone have any thoughts for or against?

https://github.com/riptano/ccm
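
For anyone who hasn't used it, the kind of workflow ccm covers is roughly the 
following. This is only an illustrative sketch driving the ccm CLI from Python; 
the cluster name and Cassandra version are arbitrary examples, and the flags 
are the ones shown in the ccm README.

    import subprocess

    def ccm(*args):
        # Shell out to the ccm CLI and fail loudly if the command errors.
        subprocess.run(["ccm", *args], check=True)

    ccm("create", "test", "-v", "4.1.4", "-n", "3", "-s")  # 3-node local cluster, started
    ccm("status")                                          # show each node and its state
    ccm("remove")                                          # stop and delete the current cluster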


Re: [DISCUSS] Adding support for BETWEEN operator

2024-05-15 Thread Jon Haddad
I was trying to have a discussion about a technical possibility, not a cost
benefit analysis.  More of a "how could we technically reach mars?"
discussion than a "how we get congress to authorize a budget to reach mars?"

Happy to talk about this privately with anyone interested as I enjoy a
technical discussion for the sake of a good technical discussion.

Thanks,
Jon

On Wed, May 15, 2024 at 7:18 AM Josh McKenzie  wrote:

> Is there a technical limitation that would prevent a range write that
> functions the same way as a range tombstone, other than probably needing a
> version bump of the storage format?
>
> The technical limitation would be cost/benefit due to how this intersects
> w/our architecture I think.
>
> Range tombstones have taught us that something that should be relatively
> simple (merge in deletion mask at read time) introduces a significant
> amount of complexity on all the paths Benjamin enumerated with a pretty
> long tail of bugs and data incorrectness issues and edge cases. The work to
> get there, at a high level glance, would be:
>
>1. Updates to CQL grammar, spec
>2. Updates to write path
>3. Updates to accord. And thinking about how this intersects
>w/accord's WAL / logic (I think? Consider me not well educated on details
>here)
>4. Updates to compaction w/consideration for edge cases on all the
>different compaction strategies
>5. Updates to iteration and merge logic
>6. Updates to paging logic
>7. Indexing
>8. repair, both full and incremental implications, support, etc
>9. the list probably goes on? There's always >= 1 thing we're not
>thinking of with a change like this. Usually more.
>
> For all of the above we also would need unit, integration, and fuzz
> testing extensively to ensure the introduction of this new spanning concept
> on a write doesn't introduce edge cases where incorrect data is returned on
> merge.
>
> All of which is to say: it's an interesting problem, but IMO given our
> architecture and what we know about the past of trying to introduce an
> architectural concept like this, the costs to getting something like this
> to production ready are pretty high.
>
> To me the cost/benefit don't really balance out. Just my .02 though.
>
> On Tue, May 14, 2024, at 2:50 PM, Benjamin Lerer wrote:
>
> It would be a lot more constructive to apply our brains towards solving an
> interesting problem than pointing out all its potential flaws based on gut
> feelings.
>
>
> It is not simply a gut feeling, Jon. This change impacts read, write,
> indexing, storage, compaction, repair... The risk and cost associated with
> it are pretty significant and I am not convinced at this point of its
> benefit.
>
> On Tue, May 14, 2024 at 7:05 PM, Jon Haddad wrote:
>
> Personally, I don't think that something being scary at first glance is a
> good reason not to explore an idea.  The scenario you've described here is
> tricky but I'm not expecting it to be any worse than say, SAI, which (the
> last I checked) has O(N) complexity on returning result sets with regard to
> rows returned.  We've also merged in Vector search which has O(N) overhead
> with the number of SSTables.  We're still fundamentally looking at, in most
> cases, a limited number of SSTables and some merging of values.
>
> Write updates are essentially a timestamped mask, potentially overlapping,
> and I suspect potentially resolvable during compaction by propagating the
> values.  They could be eliminated or narrowed based on how they've
> propagated by using the timestamp metadata on the SSTable.
>
> It would be a lot more constructive to apply our brains towards solving an
> interesting problem than pointing out all its potential flaws based on gut
> feelings.  We haven't even moved this past an idea.
>
> I think it would solve a massive problem for a lot of people and is 100%
> worth considering.  Thanks Patrick and David for raising this.
>
> Jon
>
>
>
> On Tue, May 14, 2024 at 9:48 AM Bowen Song via dev <
> dev@cassandra.apache.org> wrote:
>
>
> Ranged update sounds like a disaster for compaction and read performance.
>
> Imagine compacting or reading some SSTables in which a large number of
> overlapping but non-identical ranges were updated with different values. It
> gives me a headache by just thinking about it.
>
> Ranged delete is much simpler, because the "value" is the same tombstone
> marker, and it also is guaranteed to expire and disappear eventually, so
> the performance impact of dealing with them at read and compaction time
> doesn't suffer in the long term.
>
> On 14/05/2024 16:59, Benjamin Lerer wrote:
>
> It should be like range tombstones ... in much worse ;-). A tombstone is a
> simple marker (deleted). An update can be far more complex.
>
> On Tue, May 14, 2024 at 3:52 PM, Jon Haddad wrote:
>
> Is there a technical limitation that would prevent a range write that
> functions the same way as a range tombstone, other than probably needing a
> version bump of the storage format?

Re: [DISCUSS] Adding support for BETWEEN operator

2024-05-15 Thread David Capwell
Thanks for the reply Benjamin, makes sense to me.  We can always add it to 
UPDATE later if it makes sense; we don't need it now.

> On May 15, 2024, at 7:44 AM, Jon Haddad  wrote:
> 
> I was trying to have a discussion about a technical possibility, not a cost 
> benefit analysis.  More of a "how could we technically reach mars?" 
> discussion than a "how we get congress to authorize a budget to reach mars?"
> 
> Happy to talk about this privately with anyone interested as I enjoy a 
> technical discussion for the sake of a good technical discussion.
> 
> Thanks,
> Jon
> 
> On Wed, May 15, 2024 at 7:18 AM Josh McKenzie  wrote:
>> Is there a technical limitation that would prevent a range write that 
>> functions the same way as a range tombstone, other than probably needing a 
>> version bump of the storage format?
> The technical limitation would be cost/benefit due to how this intersects 
> w/our architecture I think.
> 
> Range tombstones have taught us that something that should be relatively 
> simple (merge in deletion mask at read time) introduces a significant amount 
> of complexity on all the paths Benjamin enumerated with a pretty long tail of 
> bugs and data incorrectness issues and edge cases. The work to get there, at 
> a high level glance, would be:
> • Updates to CQL grammar, spec
> • Updates to write path
> • Updates to accord. And thinking about how this intersects w/accord's 
> WAL / logic (I think? Consider me not well educated on details here)
> • Updates to compaction w/consideration for edge cases on all the 
> different compaction strategies
> • Updates to iteration and merge logic
> • Updates to paging logic
> • Indexing
> • repair, both full and incremental implications, support, etc
> • the list probably goes on? There's always >= 1 thing we're not thinking 
> of with a change like this. Usually more.
> For all of the above we also would need unit, integration, and fuzz testing 
> extensively to ensure the introduction of this new spanning concept on a 
> write doesn't introduce edge cases where incorrect data is returned on merge.
> 
> All of which is to say: it's an interesting problem, but IMO given our 
> architecture and what we know about the past of trying to introduce an 
> architectural concept like this, the costs to getting something like this to 
> production ready are pretty high.
> 
> To me the cost/benefit don't really balance out. Just my .02 though.
> 
> On Tue, May 14, 2024, at 2:50 PM, Benjamin Lerer wrote:
>> It would be a lot more constructive to apply our brains towards solving an 
>> interesting problem than pointing out all its potential flaws based on gut 
>> feelings.
>> 
>> It is not simply a gut feeling, Jon. This change impacts read, write, 
>> indexing, storage, compaction, repair... The risk and cost associated with 
>> it are pretty significant and I am not convinced at this point of its 
>> benefit.
>> 
>> On Tue, May 14, 2024 at 7:05 PM, Jon Haddad wrote:
>> Personally, I don't think that something being scary at first glance is a 
>> good reason not to explore an idea.  The scenario you've described here is 
>> tricky but I'm not expecting it to be any worse than say, SAI, which (the 
>> last I checked) has O(N) complexity on returning result sets with regard to 
>> rows returned.  We've also merged in Vector search which has O(N) overhead 
>> with the number of SSTables.  We're still fundamentally looking at, in most 
>> cases, a limited number of SSTables and some merging of values.
>> 
>> Write updates are essentially a timestamped mask, potentially overlapping, 
>> and I suspect potentially resolvable during compaction by propagating the 
>> values.  They could be eliminated or narrowed based on how they've 
>> propagated by using the timestamp metadata on the SSTable.
>> 
>> It would be a lot more constructive to apply our brains towards solving an 
>> interesting problem than pointing out all its potential flaws based on gut 
>> feelings.  We haven't even moved this past an idea.  
>> 
>> I think it would solve a massive problem for a lot of people and is 100% 
>> worth considering.  Thanks Patrick and David for raising this.
>> 
>> Jon
>> 
>> 
>> 
>> On Tue, May 14, 2024 at 9:48 AM Bowen Song via dev 
>>  wrote:
>> 
>> Ranged update sounds like a disaster for compaction and read performance.
>> Imagine compacting or reading some SSTables in which a large number of 
>> overlapping but non-identical ranges were updated with different values. It 
>> gives me a headache by just thinking about it.
>> Ranged delete is much simpler, because the "value" is the same tombstone 
>> marker, and it also is guaranteed to expire and disappear eventually, so the 
>> performance impact of dealing with them at read and compaction time doesn't 
>> suffer in the long term.
>> 
>> On 14/05/2024 16:59, Benjamin Lerer wrote:
>>> It should be like range tombstones ... in much worse ;-). A tombstone is a 
>>> simple marker (deleted). An update can be far more complex.

Re: [DISCUSS] Adding support for BETWEEN operator

2024-05-15 Thread Jeff Jirsa
You can remove the shadowed values at compaction time, but you can’t ever fully 
propagate the range update to point updates, so you’d be propagating all of the 
range-update structures throughout everything forever. It’s JUST like a range 
tombstone - you don’t know what it’s shadowing (and can’t, in many cases, 
because the width of the range is uncountable for some types). 
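
To make the shadowing point concrete, here is a minimal sketch (a toy model in 
Python, not Cassandra's actual storage internals) of why the range-update 
structure has to be carried through merges: at read time it can override 
whichever point cells happen to be present, but it can never be rewritten as a 
finite set of point updates, because it also covers keys that haven't been 
written yet.

    from dataclasses import dataclass

    @dataclass
    class Cell:
        key: str    # clustering key
        value: str
        ts: int     # write timestamp

    @dataclass
    class RangeUpdate:
        start: str  # inclusive bounds over the clustering key
        end: str
        value: str
        ts: int

    def merge_read(cells, range_updates):
        # Newest write wins, whether it's a point cell or a covering range
        # update. The range updates are only applied here; they can't be
        # discarded afterwards, because keys not present in this read may
        # still fall inside them.
        out = {}
        for c in cells:
            winner = c
            for r in range_updates:
                if r.start <= c.key <= r.end and r.ts > winner.ts:
                    winner = Cell(c.key, r.value, r.ts)
            out[c.key] = winner.value
        return out

    cells = [Cell("aa", "v1", ts=1), Cell("zz", "v2", ts=5)]
    ranges = [RangeUpdate("a", "m", "patched", ts=3)]
    print(merge_read(cells, ranges))  # {'aa': 'patched', 'zz': 'v2'}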

Setting aside whether or not this construct is worth adding (I suspect a lot of 
binding votes would say it's not), the thread focuses on the BETWEEN operator, 
and there's no reason we should pollute the conversation of "add a missing SQL 
operator that basically maps to existing functionality" with the creation of a 
brand new form of update that definitely doesn't map to any existing concepts.
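
For contrast, the "maps to existing functionality" part is simple: BETWEEN is 
shorthand for an inclusive >= / <= pair on the same column. A sketch with the 
DataStax Python driver and a hypothetical events table (partition key id, 
clustering column ts); the BETWEEN form is the syntax proposed in 
CASSANDRA-19604 and won't run against released versions yet.

    from cassandra.cluster import Cluster  # pip install cassandra-driver

    session = Cluster(["127.0.0.1"]).connect("ks")

    # Proposed syntax (CASSANDRA-19604), inclusive on both bounds:
    session.execute(
        "SELECT * FROM events WHERE id = %s AND ts BETWEEN %s AND %s",
        (42, 100, 200),
    )

    # What it maps to today with explicit range restrictions:
    session.execute(
        "SELECT * FROM events WHERE id = %s AND ts >= %s AND ts <= %s",
        (42, 100, 200),
    )

    # The ticket covers DELETE the same way, e.g.
    # DELETE FROM events WHERE id = ? AND ts BETWEEN ? AND ?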





> On May 14, 2024, at 10:05 AM, Jon Haddad  wrote:
> 
> Personally, I don't think that something being scary at first glance is a 
> good reason not to explore an idea.  The scenario you've described here is 
> tricky but I'm not expecting it to be any worse than say, SAI, which (the 
> last I checked) has O(N) complexity on returning result sets with regard to 
> rows returned.  We've also merged in Vector search which has O(N) overhead 
> with the number of SSTables.  We're still fundamentally looking at, in most 
> cases, a limited number of SSTables and some merging of values.
> 
> Write updates are essentially a timestamped mask, potentially overlapping, 
> and I suspect potentially resolvable during compaction by propagating the 
> values.  They could be eliminated or narrowed based on how they've propagated 
> by using the timestamp metadata on the SSTable.
> 
> It would be a lot more constructive to apply our brains towards solving an 
> interesting problem than pointing out all its potential flaws based on gut 
> feelings.  We haven't even moved this past an idea.  
> 
> I think it would solve a massive problem for a lot of people and is 100% 
> worth considering.  Thanks Patrick and David for raising this.
> 
> Jon
> 
> 
> 
> On Tue, May 14, 2024 at 9:48 AM Bowen Song via dev wrote:
>> Ranged update sounds like a disaster for compaction and read performance.
>> 
>> Imagine compacting or reading some SSTables in which a large number of 
>> overlapping but non-identical ranges were updated with different values. It 
>> gives me a headache by just thinking about it.
>> 
>> Ranged delete is much simpler, because the "value" is the same tombstone 
>> marker, and it also is guaranteed to expire and disappear eventually, so the 
>> performance impact of dealing with them at read and compaction time doesn't 
>> suffer in the long term.
>> 
>> 
>> On 14/05/2024 16:59, Benjamin Lerer wrote:
>>> It should be like range tombstones ... in much worse ;-). A tombstone is a 
>>> simple marker (deleted). An update can be far more complex.  
>>> 
>>> On Tue, May 14, 2024 at 3:52 PM, Jon Haddad wrote:
 Is there a technical limitation that would prevent a range write that 
 functions the same way as a range tombstone, other than probably needing a 
 version bump of the storage format?
 
 
 On Tue, May 14, 2024 at 12:03 AM Benjamin Lerer wrote:
> Range restrictions (>, >=, <=, < and BETWEEN) do not work on UPDATEs. 
> They do work on DELETE because under the hood C* translates them into 
> range tombstones.
> 
> On Tue, May 14, 2024 at 2:44 AM, David Capwell wrote:
>> I would also include in UPDATE… but yeah, <3 BETWEEN and welcome this 
>> work.
>> 
>>> On May 13, 2024, at 7:40 AM, Patrick McFadin wrote:
>>> 
>>> This is a great feature addition to CQL! I get asked about it from time 
>>> to time but then people figure out a workaround. It will be great to 
>>> just have it available. 
>>> 
>>> And right on Simon! I think the only project I had as a high school 
>>> senior was figuring out how many parties I could go to and still 
>>> maintain a passing grade. Thanks for your work here. 
>>> 
>>> Patrick 
>>> 
 On Mon, May 13, 2024 at 1:35 AM Benjamin Lerer wrote:
 Hi everybody,
 
 Just raising awareness that Simon is working on adding support for the 
 BETWEEN operator in WHERE clauses (SELECT and DELETE) in 
 CASSANDRA-19604. We plan to add support for it in conditions in a 
 separate patch.
 
 The patch is available.
 
 As a side note, Simon chose to contribute to Apache Cassandra for his high 
 school senior project. This patch is his first contribution for his senior 
 project (and his second feature contribution to Apache Cassandra).
 
 



Re: [DISCUSS] ccm as a subproject

2024-05-15 Thread Bret McGuire
   Speaking only for myself I _love_ this idea.  The various drivers use
ccm extensively in their integration test suites so having this tool
in-house and actively looked after would be very beneficial for our work.

   - Bret -

On Wed, May 15, 2024 at 9:23 AM Josh McKenzie  wrote:

> Right now ccm isn't formally a subproject of Cassandra or under governance
> of the ASF. Given it's an integral components of our CI as well as for
> local testing for many devs, and we now have more experience w/our muscle
> on IP clearance and ingesting / absorbing subprojects where we can't track
> down every single contributor to get an ICLA, seems like it might be worth
> revisiting the topic of donation of ccm to Apache.
>
> For what it's worth, Sylvain originally and then DataStax after transfer
> have both been incredible and receptive stewards of the projects and repos,
> so this isn't about any response to any behavior on their part.
> Structurally, however, it'd be better for the health of the project(s)
> long-term to have ccm promoted in. As far as I know there was strong
> receptivity to that donation in the past but the IP clearance was the
> primary hurdle.
>
> Anyone have any thoughts for or against?
>
> https://github.com/riptano/ccm
>
>


Re: [DISCUSS] ccm as a subproject

2024-05-15 Thread Abe Ratnofsky
Strong supporter of bringing ccm into the project as well. ccm is necessary 
test infrastructure for multiple subprojects, and Cassandra committers should 
be able to make the changes to ccm that are necessary for their patches.

There's also the security angle: we should work to consolidate our dependencies 
where appropriate, and reduce the risk of supply chain antics.

> On May 15, 2024, at 4:25 PM, Bret McGuire  wrote:
> 
>Speaking only for myself I _love_ this idea.  The various drivers use ccm 
> extensively in their integration test suites so having this tool in-house and 
> actively looked after would be very beneficial for our work.
> 
>- Bret -
> 
> On Wed, May 15, 2024 at 9:23 AM Josh McKenzie wrote:
>> Right now ccm isn't formally a subproject of Cassandra or under governance 
>> of the ASF. Given it's an integral components of our CI as well as for local 
>> testing for many devs, and we now have more experience w/our muscle on IP 
>> clearance and ingesting / absorbing subprojects where we can't track down 
>> every single contributor to get an ICLA, seems like it might be worth 
>> revisiting the topic of donation of ccm to Apache.
>> 
>> For what it's worth, Sylvain originally and then DataStax after transfer 
>> have both been incredible and receptive stewards of the projects and repos, 
>> so this isn't about any response to any behavior on their part. 
>> Structurally, however, it'd be better for the health of the project(s) 
>> long-term to have ccm promoted in. As far as I know there was strong 
>> receptivity to that donation in the past but the IP clearance was the 
>> primary hurdle.
>> 
>> Anyone have any thoughts for or against?
>> 
>> https://github.com/riptano/ccm
>> 



Re: [DISCUSS] ccm as a subproject

2024-05-15 Thread Paulo Motta
As much as I'd like to remove the dependency on ccm I think we'll stick
with it for a bit, so +1 on moving under the project umbrella.

In the long term it would be nice to modernize integration test suites to
use containers instead of processes for more flexibility and fewer
dependencies for local development. Perhaps an incremental way to do that
would be to add a docker backend to ccm.
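
Purely as a sketch of that incremental path: keep ccm's user-facing notion of a 
node and let a backend decide whether it's a local process or a container. The 
names below (NodeBackend, DockerBackend) are hypothetical, not existing ccm 
code; the docker commands just run the official cassandra image.

    import subprocess
    from abc import ABC, abstractmethod

    class NodeBackend(ABC):
        @abstractmethod
        def start(self, name: str, version: str) -> None: ...

        @abstractmethod
        def stop(self, name: str) -> None: ...

    class DockerBackend(NodeBackend):
        # Run each node as a container from the official image instead of a
        # locally installed Cassandra process.
        def start(self, name, version):
            subprocess.run(
                ["docker", "run", "-d", "--name", name, f"cassandra:{version}"],
                check=True,
            )

        def stop(self, name):
            subprocess.run(["docker", "rm", "-f", name], check=True)

The existing process-based behaviour would sit behind the same interface, so 
callers and the CLI wouldn't need to change.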

On Wed, May 15, 2024 at 4:25 PM Bret McGuire  wrote:

>Speaking only for myself I _love_ this idea.  The various drivers use
> ccm extensively in their integration test suites so having this tool
> in-house and actively looked after would be very beneficial for our work.
>
>- Bret -
>
> On Wed, May 15, 2024 at 9:23 AM Josh McKenzie 
> wrote:
>
>> Right now ccm isn't formally a subproject of Cassandra or under
>> governance of the ASF. Given it's an integral components of our CI as well
>> as for local testing for many devs, and we now have more experience w/our
>> muscle on IP clearance and ingesting / absorbing subprojects where we can't
>> track down every single contributor to get an ICLA, seems like it might be
>> worth revisiting the topic of donation of ccm to Apache.
>>
>> For what it's worth, Sylvain originally and then DataStax after transfer
>> have both been incredible and receptive stewards of the projects and repos,
>> so this isn't about any response to any behavior on their part.
>> Structurally, however, it'd be better for the health of the project(s)
>> long-term to have ccm promoted in. As far as I know there was strong
>> receptivity to that donation in the past but the IP clearance was the
>> primary hurdle.
>>
>> Anyone have any thoughts for or against?
>>
>> https://github.com/riptano/ccm
>>
>>


Re: [DISCUSS] ccm as a subproject

2024-05-15 Thread Bret McGuire
   Very much agreed Paulo; I was musing on the idea of adding Docker
support to ccm recently as well.  We'd want to preserve the current ability
to work with releases (and Github branches) but I very much like the idea
of adding Docker support as a new feature.

On Wed, May 15, 2024 at 3:56 PM Paulo Motta  wrote:

> As much as I'd like to remove the dependency on ccm I think we'll stick
> with it for a bit, so +1 on moving under the project umbrella.
>
> In the long term it would be nice to modernize integration test suites to
> use containers instead of processes for more flexibility and fewer
> dependencies for local development. Perhaps an incremental way to do that
> would be to add a docker backend to ccm.
>
> On Wed, May 15, 2024 at 4:25 PM Bret McGuire 
> wrote:
>
>>Speaking only for myself I _love_ this idea.  The various drivers use
>> ccm extensively in their integration test suites so having this tool
>> in-house and actively looked after would be very beneficial for our work.
>>
>>- Bret -
>>
>> On Wed, May 15, 2024 at 9:23 AM Josh McKenzie 
>> wrote:
>>
>>> Right now ccm isn't formally a subproject of Cassandra or under
>>> governance of the ASF. Given it's an integral components of our CI as well
>>> as for local testing for many devs, and we now have more experience w/our
>>> muscle on IP clearance and ingesting / absorbing subprojects where we can't
>>> track down every single contributor to get an ICLA, seems like it might be
>>> worth revisiting the topic of donation of ccm to Apache.
>>>
>>> For what it's worth, Sylvain originally and then DataStax after transfer
>>> have both been incredible and receptive stewards of the projects and repos,
>>> so this isn't about any response to any behavior on their part.
>>> Structurally, however, it'd be better for the health of the project(s)
>>> long-term to have ccm promoted in. As far as I know there was strong
>>> receptivity to that donation in the past but the IP clearance was the
>>> primary hurdle.
>>>
>>> Anyone have any thoughts for or against?
>>>
>>> https://github.com/riptano/ccm
>>>
>>>


Re: [DISCUSS] ccm as a subproject

2024-05-15 Thread David Capwell
Yes please!

> On May 15, 2024, at 2:23 PM, Bret McGuire  wrote:
> 
>Very much agreed Paulo; I was musing on the idea of adding Docker support 
> to ccm recently as well.  We'd want to preserve the current ability to work 
> with releases (and Github branches) but I very much like the idea of adding 
> Docker support as a new feature.
> 
>> On Wed, May 15, 2024 at 3:56 PM Paulo Motta wrote:
>> As much as I'd like to remove the dependency on ccm I think we'll stick with 
>> it for a bit, so +1 on moving under the project umbrella.
>> 
>> In the long term it would be nice to modernize integration test suites to 
>> use containers instead of processes for more flexibility and fewer 
>> dependencies for local development. Perhaps an incremental way to do that 
>> would be to add a docker backend to ccm.
>> 
>> On Wed, May 15, 2024 at 4:25 PM Bret McGuire wrote:
>>>Speaking only for myself I _love_ this idea.  The various drivers use 
>>> ccm extensively in their integration test suites so having this tool 
>>> in-house and actively looked after would be very beneficial for our work.
>>> 
>>>- Bret -
>>> 
>>> On Wed, May 15, 2024 at 9:23 AM Josh McKenzie wrote:
 Right now ccm isn't formally a subproject of Cassandra or under governance 
 of the ASF. Given it's an integral components of our CI as well as for 
 local testing for many devs, and we now have more experience w/our muscle 
 on IP clearance and ingesting / absorbing subprojects where we can't track 
 down every single contributor to get an ICLA, seems like it might be worth 
 revisiting the topic of donation of ccm to Apache.
 
 For what it's worth, Sylvain originally and then DataStax after transfer 
 have both been incredible and receptive stewards of the projects and 
 repos, so this isn't about any response to any behavior on their part. 
 Structurally, however, it'd be better for the health of the project(s) 
 long-term to have ccm promoted in. As far as I know there was strong 
 receptivity to that donation in the past but the IP clearance was the 
 primary hurdle.
 
 Anyone have any thoughts for or against?
 
 https://github.com/riptano/ccm