date:20250304

I am not saying that using remote object storage is useless.

I am just saying that I don't see the difference. I have not measured it,
but I imagine that a mounted s3 bucket would use, under the hood, the same
calls to the s3 api. How else would it be done? You need to talk to remote
s3 storage eventually anyway. So why does it matter whether we call the s3
api from Java or by other means from some "s3 driver"? It is eventually
using the same thing, no?

On Tue, Mar 4, 2025 at 12:47 PM Jeff Jirsa  wrote:

> Mounting an s3 bucket as a directory is an easy but poor implementation of
> object backed storage for databases
>
> Object storage is durable (most data loss is due to bugs not concurrent
> hardware failures), cheap (can 5-10x cheaper) and ubiquitous. A  huge
> number of modern systems are object-storage-only because the approximately
> infinite scale / cost / throughput tradeoffs often make up for the latency.
>
> Outright dismissing object storage for Cassandra is short sighted - it
> needs to be done in a way that makes sense, not just blindly copying over
> the block access patterns to object.
>
>
> On Mar 4, 2025, at 11:19 AM, Štefan Miklošovič 
> wrote:
>
> 
> I do not think we need this CEP, honestly. I don't want to diss this
> unnecessarily but if you mount a remote storage locally (e.g. mounting s3
> bucket as if it was any other directory on node's machine), then what is
> this CEP good for?
>
> Not talking about the necessity to put all dependencies to be able to talk
> to respective remote storage to Cassandra's class path, introducing
> potential problems with dependencies and their possible incompatibilities /
> different versions etc ...
>
> On Thu, Feb 27, 2025 at 6:21 AM C. Scott Andreas 
> wrote:
>
>> I’d love to see this implemented — where “this” is a proxy for some
>> notion of support for remote object storage, perhaps usable by compaction
>> strategies like TWCS to migrate data older than a threshold from a local
>> filesystem to remote object.
>>
>> It’s not an area where I can currently dedicate engineering effort. But
>> if others are interested in contributing a feature like this, I’d see it as
>> valuable for the project and would be happy to collaborate on
>> design/architecture/goals.
>>
>> – Scott
>>
>> On Feb 26, 2025, at 6:56 AM, guo Maxwell  wrote:
>>
>> 
>> Is anyone else interested in continuing to discuss this topic?
>>
>> guo Maxwell wrote on Fri, Sep 20, 2024 at 09:44:
>>
>>> I discussed this offline with Claude, he is no longer working on this.
>>>
>>> It's a pity. I think this is a very valuable thing. Commitlog's
>>> archiving and restore may be able to use the relevant code if it is
>>> completed.
>>>
>>> Patrick McFadin wrote on Fri, Sep 20, 2024 at 2:01 AM:
>>>
>>>> Thanks for reviving this one!
>>>>
>>>> On Wed, Sep 18, 2024 at 12:06 AM guo Maxwell
>>>> wrote:
>>>>
>>>>> Is there any update on this topic? It seems that things can make a
>>>>> big progress if Jake Luciani can find someone who can make the
>>>>> FileSystemProvider code accessible.
>>>>>
>>>>> Jon Haddad wrote on Sat, Dec 16, 2023 at 05:29:
>>>>>
>>>>>> At a high level I really like the idea of being able to better
>>>>>> leverage cheaper storage especially object stores like S3.
>>>>>>
>>>>>> One important thing though - I feel pretty strongly that there's a
>>>>>> big, deal breaking downside. Backups, disk failure policies,
>>>>>> snapshots and possibly repairs would get more complicated which
>>>>>> haven't been particularly great in the past, and of course there's
>>>>>> the issue of failure recovery being only partially possible if
>>>>>> you're looking at a durable block store paired with an ephemeral
>>>>>> one with some of your data not replicated to the cold side. That
>>>>>> introduces a failure case that's unacceptable for most teams, which
>>>>>> results in needing to implement potentially 2 different backup
>>>>>> solutions. This is operationally complex with a lot of surface area
>>>>>> for headaches. I think a lot of teams would probably have an issue
>>>>>> with the big question mark around durability and I probably would
>>>>>> avoid it myself.
>>>>>>
>>>>>> On the other hand, I'm +1 if we approach it something slightly
>>>>>> differently - where _all_ the data is located on the cold storage,
>>>>>> with the local hot storage used as a cache. This means we can use
>>>>>> the cold directories for the complete dataset, simplifying backups
>>>>>> and node replacements.
>>>>>>
>>>>>> For a little background, we had a ticket several years ago where I
>>>>>> pointed out it was possible to do this *today* at the operating
>>>>>> system level as long as you're using block devices (vs an object
>>>>>> store) and LVM [1]. For example, this works well with GP3 EBS w/
>>>>>> low IOPS provisioning + local NVMe to get a nice balance of great
>>>>>> read performance without going nuts on the cost for IOPS. I also
>>>>>> wrote about this in a little more detail in my blog [2]. There's als
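
For readers following along, Jon's quoted suggestion (all data on cold storage, local hot storage used only as a cache) can be pictured roughly as below. This is a minimal sketch and not anything from Cassandra: `ReadThroughCache`, the dict standing in for the cold store, and the LRU eviction policy are all assumptions made for illustration.

```python
from collections import OrderedDict


class ReadThroughCache:
    """Local hot tier acting purely as a cache over a complete cold store."""

    def __init__(self, cold_store, capacity):
        self.cold = cold_store      # hypothetical: dict-like, holds the full dataset
        self.capacity = capacity    # max number of files kept on the hot tier
        self.hot = OrderedDict()    # LRU order, least recently used first
        self.cold_reads = 0

    def write(self, name, data):
        # Every write lands on the cold store, so the cold side is always
        # the complete dataset and the hot tier can be lost without data loss.
        self.cold[name] = data
        self._admit(name, data)

    def read(self, name):
        if name in self.hot:
            self.hot.move_to_end(name)   # mark as most recently used
            return self.hot[name]
        self.cold_reads += 1
        data = self.cold[name]
        self._admit(name, data)
        return data

    def _admit(self, name, data):
        self.hot[name] = data
        self.hot.move_to_end(name)
        while len(self.hot) > self.capacity:
            self.hot.popitem(last=False)  # evict the least recently used entry
```

Because the cold store is complete, clearing `hot` (for example, an ephemeral NVMe volume disappearing) only costs extra cold reads, which is the property that simplifies backups and node replacement in this approach.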

Re: Welcome Aaron Ploetz as Cassandra Committer

2025-03-04 Thread J. D. Jordan

🎉

On Mar 4, 2025, at 5:49 AM, Ekaterina Dimitrova wrote:

> Congrats!!! 🎉
>
> On Tue, 4 Mar 2025 at 6:11, Josh McKenzie wrote:
>
>> Congrats Aaron!
>>
>> On Tue, Mar 4, 2025, at 4:08 AM, Soheil Rahsaz wrote:
>>
>> Congratulations Aaron!
>>
>> On Tue, Mar 4, 2025 at 12:09 PM Paulo Motta wrote:
>>
>> Congratulations Aaron, happy to see you recognized as a committer!
>>
>> Cheers,
>>
>> Paulo
>>
>> On Tue, 4 Mar 2025 at 03:26 Bernardo Botella wrote:
>>
>> That’s awesome!!
>>
>> Congratulations Aaron!! Long overdue for sure!
>>
>> On Mon, Mar 3, 2025 at 16:25 Patrick McFadin wrote:
>>
>> The Apache Cassandra PMC is very happy to announce that Aaron Ploetz has
>> accepted the invitation to become a committer!
>>
>> Aaron has been tireless in his mission to help every single Cassandra
>> operator on planet Earth. If you don't believe me, check out his Stack
>> Overflow profile page: https://stackoverflow.com/users/1054558/aaron
>> He's been a continuous speaker on Cassandra topics and is one of the
>> coordinators for the Planet Cassandra meetup. Those are just the
>> recent highlights.
>>
>> Please join us in congratulating and welcoming Aaron.
>>
>> The Apache Cassandra PMC members

Re: Welcome Aaron Ploetz as Cassandra Committer

2025-03-04 Thread Aaron

Thank you everyone!

Aaron


On Tue, Mar 4, 2025 at 7:02 AM J. D. Jordan 
wrote:

> 🎉
>
> On Mar 4, 2025, at 5:49 AM, Ekaterina Dimitrova 
> wrote:
>
> 
> Congrats!!! 🎉
>
> On Tue, 4 Mar 2025 at 6:11, Josh McKenzie  wrote:
>
>> Congrats Aaron!
>>
>> On Tue, Mar 4, 2025, at 4:08 AM, Soheil Rahsaz wrote:
>>
>> Congratulations Aaron!
>>
>> On Tue, Mar 4, 2025 at 12:09 PM Paulo Motta  wrote:
>>
>> Congratulations Aaron, happy to see you recognized as a committer!
>>
>> Cheers,
>>
>> Paulo
>>
>> On Tue, 4 Mar 2025 at 03:26 Bernardo Botella <
>> conta...@bernardobotella.com> wrote:
>>
>> That’s awesome!!
>>
>> Congratulations Aaron!! Long overdue for sure!
>>
>>
>> On Mon, Mar 3, 2025 at 16:25 Patrick McFadin  wrote:
>>
>> The Apache Cassandra PMC is very happy to announce that Aaron Ploetz has
>> accepted the invitation to become a committer!
>>
>> Aaron has been tireless in his mission to help every single Cassandra
>> operator on planet Earth. If you don't believe me, check out his Stack
>> Overflow profile page: https://stackoverflow.com/users/1054558/aaron
>> He's been a continuous speaker on Cassandra topics and is one of the
>> coordinators for the Planet Cassandra meetup. Those are just the
>> recent highlights.
>>
>> Please join us in congratulating and welcoming Aaron.
>>
>> The Apache Cassandra PMC members
>>
>>

Re: [DISCUSS] CEP-36: A Configurable ChannelProxy to alias external storage locations

I would be very cautious about "reinventing the wheel" here. Are we
confident that we can implement it in a way that is bug-free / robust
enough? Do we think we can do a better job than the authors of "s3 driver"?
If they did a good job (which I assume they did) then failing an operation
while writing to / reading from remote storage mounted locally _should_ be
ideally indistinguishable from dealing with it locally.

So, what are these differences exactly? Not asking you specifically but in
general. I think this all should be answered before doing this in order to
know what we are getting ourselves into.
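
One way to picture the kind of difference Brandon mentions in the quoted message: a direct API client can see why a call failed and react per failure type, while a mounted directory tends to hand the application a generic I/O error. The sketch below is illustrative only. `Throttled`, `ObjectMissing`, and both read functions are hypothetical names, and the assumption that a mount reports every remote failure as `EIO` is a simplification; the actual errno mapping depends on the driver.

```python
import errno


class Throttled(Exception):
    """Hypothetical: the remote API asked the caller to slow down (retryable)."""


class ObjectMissing(Exception):
    """Hypothetical: the object does not exist (retrying cannot help)."""


def read_via_api(fetch, key, max_attempts=3):
    """Direct API client: each failure type can get its own policy."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch(key)
        except Throttled:
            if attempt == max_attempts:
                raise
            # A real client would back off here before the next attempt.
        # ObjectMissing is not caught, so it propagates immediately.


def read_via_mount(fetch, key):
    """Mounted bucket, assumed to report every remote failure as a plain EIO."""
    try:
        return fetch(key)
    except (Throttled, ObjectMissing) as exc:
        raise OSError(errno.EIO, "Input/output error") from exc
```

In the second function the caller can no longer tell throttling from a missing object, so decisions such as "retry", "mark the disk as failed", or "stop the node" have to be made with less information.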

Plus there are the disadvantages of putting all the dependencies on the
classpath. I think we would never release it. We would at most expose an
API for others to integrate with and it would be their job to deal with all
the complexity.

On Tue, Mar 4, 2025 at 2:57 PM Brandon Williams  wrote:

> A failing remote api that you are calling and a failing filesystem you
> are using have different implications.
>
> Kind Regards,
> Brandon
>
> On Tue, Mar 4, 2025 at 7:47 AM Štefan Miklošovič 
> wrote:
> >
> > I don't say that using remote object storage is useless.
> >
> > I am just saying that I don't see the difference. I have not measured
> > that but I can imagine that s3 mounted would use, under the hood, the
> > same calls to s3 api. How else would it be done? You need to talk to
> > remote s3 storage eventually anyway. So why does it matter if we call
> > s3 api from Java or by other means from some "s3 driver"? It is
> > eventually using same thing, no?
> >
> > On Tue, Mar 4, 2025 at 12:47 PM Jeff Jirsa wrote:
> >>
> >> Mounting an s3 bucket as a directory is an easy but poor
> >> implementation of object backed storage for databases
> >>
> >> Object storage is durable (most data loss is due to bugs not
> >> concurrent hardware failures), cheap (can 5-10x cheaper) and
> >> ubiquitous. A huge number of modern systems are object-storage-only
> >> because the approximately infinite scale / cost / throughput
> >> tradeoffs often make up for the latency.
> >>
> >> Outright dismissing object storage for Cassandra is short sighted -
> >> it needs to be done in a way that makes sense, not just blindly
> >> copying over the block access patterns to object.
> >>
> >>
> >> On Mar 4, 2025, at 11:19 AM, Štefan Miklošovič wrote:
> >>
> >> I do not think we need this CEP, honestly. I don't want to diss this
> >> unnecessarily but if you mount a remote storage locally (e.g.
> >> mounting s3 bucket as if it was any other directory on node's
> >> machine), then what is this CEP good for?
> >>
> >> Not talking about the necessity to put all dependencies to be able
> >> to talk to respective remote storage to Cassandra's class path,
> >> introducing potential problems with dependencies and their possible
> >> incompatibilities / different versions etc ...
> >>
> >> On Thu, Feb 27, 2025 at 6:21 AM C. Scott Andreas wrote:
> >>>
> >>> I’d love to see this implemented - where “this” is a proxy for some
> >>> notion of support for remote object storage, perhaps usable by
> >>> compaction strategies like TWCS to migrate data older than a
> >>> threshold from a local filesystem to remote object.
> >>>
> >>> It’s not an area where I can currently dedicate engineering effort.
> >>> But if others are interested in contributing a feature like this,
> >>> I’d see it as valuable for the project and would be happy to
> >>> collaborate on design/architecture/goals.
> >>>
> >>> – Scott
> >>>
> >>> On Feb 26, 2025, at 6:56 AM, guo Maxwell wrote:
> >>>
> >>> Is anyone else interested in continuing to discuss this topic?
> >>>
> >>> guo Maxwell wrote on Fri, Sep 20, 2024 at 09:44:
> >>>>
> >>>> I discussed this offline with Claude, he is no longer working on
> >>>> this.
> >>>>
> >>>> It's a pity. I think this is a very valuable thing. Commitlog's
> >>>> archiving and restore may be able to use the relevant code if it is
> >>>> completed.
> >>>>
> >>>> Patrick McFadin wrote on Fri, Sep 20, 2024 at 2:01 AM:
> >>>>>
> >>>>> Thanks for reviving this one!
> >>>>>
> >>>>> On Wed, Sep 18, 2024 at 12:06 AM guo Maxwell wrote:
> >>>>>>
> >>>>>> Is there any update on this topic? It seems that things can make
> >>>>>> a big progress if Jake Luciani can find someone who can make the
> >>>>>> FileSystemProvider code accessible.
> >>>>>>
> >>>>>> Jon Haddad wrote on Sat, Dec 16, 2023 at 05:29:
> >>>>>>>
> >>>>>>> At a high level I really like the idea of being able to better
> >>>>>>> leverage cheaper storage especially object stores like S3.
> >>>>>>>
> >>>>>>> One important thing though - I feel pretty strongly that there's
> >>>>>>> a big, deal breaking downside. Backups, disk failure policies,
> >>>>>>> snapshots and possibly repairs would get more complicated which
> >>>>>>> haven't been particularly great in the past, and of course
> >>>>>>> there's the issue of failure recovery being only partially
> >>>>>>> possible if you're looking at a durable block store paired with
> >>>>>>> an ephemeral one with some of your data not replicated to the
> >>>>>>> cold side. That introduces a failure case that's unacceptable for

Re: [DISCUSS] CEP-36: A Configurable ChannelProxy to alias external storage locations

2025-03-04 Thread Jeff Jirsa

Most obviously, you don’t need to move all components of the sstable to s3,
you could keep index + compression offsets locally.

On Mar 4, 2025, at 1:46 PM, Štefan Miklošovič wrote:

> I don't say that using remote object storage is useless.
>
> I am just saying that I don't see the difference. I have not measured that
> but I can imagine that s3 mounted would use, under the hood, the same calls
> to s3 api. How else would it be done? You need to talk to remote s3 storage
> eventually anyway. So why does it matter if we call s3 api from Java or by
> other means from some "s3 driver"? It is eventually using same thing, no?
>
> On Tue, Mar 4, 2025 at 12:47 PM Jeff Jirsa wrote:
>
>> Mounting an s3 bucket as a directory is an easy but poor implementation
>> of object backed storage for databases
>>
>> Object storage is durable (most data loss is due to bugs not concurrent
>> hardware failures), cheap (can 5-10x cheaper) and ubiquitous. A huge
>> number of modern systems are object-storage-only because the approximately
>> infinite scale / cost / throughput tradeoffs often make up for the latency.
>>
>> Outright dismissing object storage for Cassandra is short sighted - it
>> needs to be done in a way that makes sense, not just blindly copying over
>> the block access patterns to object.
>>
>> On Mar 4, 2025, at 11:19 AM, Štefan Miklošovič wrote:
>>
>> I do not think we need this CEP, honestly. I don't want to diss this
>> unnecessarily but if you mount a remote storage locally (e.g. mounting s3
>> bucket as if it was any other directory on node's machine), then what is
>> this CEP good for?
>>
>> Not talking about the necessity to put all dependencies to be able to
>> talk to respective remote storage to Cassandra's class path, introducing
>> potential problems with dependencies and their possible incompatibilities /
>> different versions etc ...
>>
>> On Thu, Feb 27, 2025 at 6:21 AM C. Scott Andreas wrote:
>>
>>> I’d love to see this implemented - where “this” is a proxy for some
>>> notion of support for remote object storage, perhaps usable by compaction
>>> strategies like TWCS to migrate data older than a threshold from a local
>>> filesystem to remote object.
>>>
>>> It’s not an area where I can currently dedicate engineering effort. But
>>> if others are interested in contributing a feature like this, I’d see it
>>> as valuable for the project and would be happy to collaborate on
>>> design/architecture/goals.
>>>
>>> – Scott
>>>
>>> On Feb 26, 2025, at 6:56 AM, guo Maxwell wrote:
>>>
>>> Is anyone else interested in continuing to discuss this topic?
>>>
>>> guo Maxwell wrote on Fri, Sep 20, 2024 at 09:44:
>>>
>>>> I discussed this offline with Claude, he is no longer working on this.
>>>>
>>>> It's a pity. I think this is a very valuable thing. Commitlog's
>>>> archiving and restore may be able to use the relevant code if it is
>>>> completed.
>>>>
>>>> Patrick McFadin wrote on Fri, Sep 20, 2024 at 2:01 AM:
>>>>
>>>>> Thanks for reviving this one!
>>>>>
>>>>> On Wed, Sep 18, 2024 at 12:06 AM guo Maxwell wrote:
>>>>>
>>>>>> Is there any update on this topic? It seems that things can make a
>>>>>> big progress if Jake Luciani can find someone who can make the
>>>>>> FileSystemProvider code accessible.
>>>>>>
>>>>>> Jon Haddad wrote on Sat, Dec 16, 2023 at 05:29:
>>>>>>
>>>>>>> At a high level I really like the idea of being able to better
>>>>>>> leverage cheaper storage especially object stores like S3.
>>>>>>>
>>>>>>> One important thing though - I feel pretty strongly that there's a
>>>>>>> big, deal breaking downside. Backups, disk failure policies,
>>>>>>> snapshots and possibly repairs would get more complicated which
>>>>>>> haven't been particularly great in the past, and of course there's
>>>>>>> the issue of failure recovery being only partially possible if
>>>>>>> you're looking at a durable block store paired with an ephemeral
>>>>>>> one with some of your data not replicated to the cold side. That
>>>>>>> introduces a failure case that's unacceptable for most teams, which
>>>>>>> results in needing to implement potentially 2 different backup
>>>>>>> solutions. This is operationally complex with a lot of surface
>>>>>>> area for headaches. I think a lot of teams would probably have an
>>>>>>> issue with the big question mark around durability and I probably
>>>>>>> would avoid it myself.
>>>>>>>
>>>>>>> On the other hand, I'm +1 if we approach it something slightly
>>>>>>> differently - where _all_ the data is located on the cold storage,
>>>>>>> with the local hot storage used as a cache. This means we can use
>>>>>>> the cold directories for the complete dataset, simplifying backups
>>>>>>> and node replacements.
>>>>>>>
>>>>>>> For a little background, we had a ticket several years ago where I
>>>>>>> pointed out it was possible to do this *today* at the operating
>>>>>>> system level as long as you're using block devices (vs an object
>>>>>>> store) and LVM [1]. For example, this works well with GP3 EBS w/
>>>>>>> low IOPS provisioning + local NVMe to get a nice balance of great
>>>>>>> read performance without going nuts on the cost for IOPS. I also
>>>>>>> wrote about this in a little more detail in my blog [2]. There's
>>>>>>> also the new mount point tech in AWS which pretty much does exactly
>>>>>>> what I've suggested above [3] that's probably worth evaluating just
>>>>>>> to get
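
As a rough illustration of the point at the top of this message (keep the index and compression offsets local, move only the bulk data to s3), here is a minimal sketch. It is not Cassandra's SSTable format: `RemoteDataFile`, `SplitSSTableReader`, `build`, and the key -> (offset, length) index are hypothetical simplifications, and the "remote" side is just an in-memory byte string that counts ranged reads.

```python
class RemoteDataFile:
    """Stand-in for an object store that supports ranged reads."""

    def __init__(self, blob):
        self.blob = blob
        self.range_requests = 0

    def get_range(self, offset, length):
        self.range_requests += 1
        return self.blob[offset:offset + length]


class SplitSSTableReader:
    """Index stays local; only the data component is remote."""

    def __init__(self, local_index, remote_data):
        self.index = local_index   # key -> (offset, length), kept on local disk
        self.data = remote_data

    def get(self, key):
        entry = self.index.get(key)
        if entry is None:
            return None            # answered locally, no remote call made
        offset, length = entry
        return self.data.get_range(offset, length)


def build(rows):
    """Lay rows out back to back and record where each one starts."""
    index, blob = {}, b""
    for key, value in rows:
        index[key] = (len(blob), len(value))
        blob += value
    return SplitSSTableReader(index, RemoteDataFile(blob))
```

A lookup for a missing key costs no remote request at all, and a hit costs exactly one ranged read. A mounted directory cannot make that split on its own, because every component looks like an ordinary file to the database.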

Re: [DISCUSS] CEP-36: A Configurable ChannelProxy to alias external storage locations

2025-03-04 Thread guo Maxwell

If my understanding is correct, doing this means wrapping the object
storage underneath and exposing file system API capabilities upwards to
the Cassandra layer.
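
A minimal sketch of that layering, assuming a flat key/value object store underneath and a file-system-shaped facade on top. `ObjectStoreFS` and its method names are hypothetical and only illustrate the shape of the adapter; a real integration would go through Java's FileSystemProvider rather than anything like this.

```python
import io


class ObjectStoreFS:
    """File-system-shaped facade over a flat key/value object store."""

    def __init__(self, store=None):
        self.store = {} if store is None else store   # key -> bytes

    @staticmethod
    def _key(path):
        return path.strip("/")

    def write_file(self, path, data):
        # Objects are written whole; there is no in-place partial update.
        self.store[self._key(path)] = bytes(data)

    def open_read(self, path):
        try:
            return io.BytesIO(self.store[self._key(path)])
        except KeyError:
            raise FileNotFoundError(path) from None

    def list_dir(self, path):
        # Directories do not exist in a flat namespace; emulate via prefixes.
        key = self._key(path)
        prefix = key + "/" if key else ""
        names = {k[len(prefix):].split("/", 1)[0]
                 for k in self.store if k.startswith(prefix)}
        return sorted(names)

    def delete(self, path):
        if self.store.pop(self._key(path), None) is None:
            raise FileNotFoundError(path)
```

The comments mark two places where object semantics leak through the facade (whole-object writes, emulated directories), which is where an adapter like this would need the most care.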


Brandon Williams wrote on Tue, Mar 4, 2025 at 9:55 PM:

> A failing remote api that you are calling and a failing filesystem you
> are using have different implications.
>
> Kind Regards,
> Brandon
>
> On Tue, Mar 4, 2025 at 7:47 AM Štefan Miklošovič 
> wrote:
> >
> > I don't say that using remote object storage is useless.
> >
> > I am just saying that I don't see the difference. I have not measured
> > that but I can imagine that s3 mounted would use, under the hood, the
> > same calls to s3 api. How else would it be done? You need to talk to
> > remote s3 storage eventually anyway. So why does it matter if we call
> > s3 api from Java or by other means from some "s3 driver"? It is
> > eventually using same thing, no?
> >
> > On Tue, Mar 4, 2025 at 12:47 PM Jeff Jirsa wrote:
> >>
> >> Mounting an s3 bucket as a directory is an easy but poor
> >> implementation of object backed storage for databases
> >>
> >> Object storage is durable (most data loss is due to bugs not
> >> concurrent hardware failures), cheap (can 5-10x cheaper) and
> >> ubiquitous. A huge number of modern systems are object-storage-only
> >> because the approximately infinite scale / cost / throughput
> >> tradeoffs often make up for the latency.
> >>
> >> Outright dismissing object storage for Cassandra is short sighted -
> >> it needs to be done in a way that makes sense, not just blindly
> >> copying over the block access patterns to object.
> >>
> >>
> >> On Mar 4, 2025, at 11:19 AM, Štefan Miklošovič wrote:
> >>
> >> I do not think we need this CEP, honestly. I don't want to diss this
> >> unnecessarily but if you mount a remote storage locally (e.g.
> >> mounting s3 bucket as if it was any other directory on node's
> >> machine), then what is this CEP good for?
> >>
> >> Not talking about the necessity to put all dependencies to be able
> >> to talk to respective remote storage to Cassandra's class path,
> >> introducing potential problems with dependencies and their possible
> >> incompatibilities / different versions etc ...
> >>
> >> On Thu, Feb 27, 2025 at 6:21 AM C. Scott Andreas wrote:
> >>>
> >>> I’d love to see this implemented - where “this” is a proxy for some
> >>> notion of support for remote object storage, perhaps usable by
> >>> compaction strategies like TWCS to migrate data older than a
> >>> threshold from a local filesystem to remote object.
> >>>
> >>> It’s not an area where I can currently dedicate engineering effort.
> >>> But if others are interested in contributing a feature like this,
> >>> I’d see it as valuable for the project and would be happy to
> >>> collaborate on design/architecture/goals.
> >>>
> >>> – Scott
> >>>
> >>> On Feb 26, 2025, at 6:56 AM, guo Maxwell wrote:
> >>>
> >>> Is anyone else interested in continuing to discuss this topic?
> >>>
> >>> guo Maxwell wrote on Fri, Sep 20, 2024 at 09:44:
> >>>>
> >>>> I discussed this offline with Claude, he is no longer working on
> >>>> this.
> >>>>
> >>>> It's a pity. I think this is a very valuable thing. Commitlog's
> >>>> archiving and restore may be able to use the relevant code if it is
> >>>> completed.
> >>>>
> >>>> Patrick McFadin wrote on Fri, Sep 20, 2024 at 2:01 AM:
> >>>>>
> >>>>> Thanks for reviving this one!
> >>>>>
> >>>>> On Wed, Sep 18, 2024 at 12:06 AM guo Maxwell wrote:
> >>>>>>
> >>>>>> Is there any update on this topic? It seems that things can make
> >>>>>> a big progress if Jake Luciani can find someone who can make the
> >>>>>> FileSystemProvider code accessible.
> >>>>>>
> >>>>>> Jon Haddad wrote on Sat, Dec 16, 2023 at 05:29:
> >>>>>>>
> >>>>>>> At a high level I really like the idea of being able to better
> >>>>>>> leverage cheaper storage especially object stores like S3.
> >>>>>>>
> >>>>>>> One important thing though - I feel pretty strongly that there's
> >>>>>>> a big, deal breaking downside. Backups, disk failure policies,
> >>>>>>> snapshots and possibly repairs would get more complicated which
> >>>>>>> haven't been particularly great in the past, and of course
> >>>>>>> there's the issue of failure recovery being only partially
> >>>>>>> possible if you're looking at a durable block store paired with
> >>>>>>> an ephemeral one with some of your data not replicated to the
> >>>>>>> cold side. That introduces a failure case that's unacceptable
> >>>>>>> for most teams, which results in needing to implement
> >>>>>>> potentially 2 different backup solutions. This is operationally
> >>>>>>> complex with a lot of surface area for headaches. I think a lot
> >>>>>>> of teams would probably have an issue with the big question mark
> >>>>>>> around durability and I probably would avoid it myself.
> >>>>>>>
> >>>>>>> On the other hand, I'm +1 if we approach it something slightly
> >>>>>>> differently - where _all_ the data is located on the cold
> >>>>>>> storage, with the local hot storage used as a cache. This means
> >>>>>>> we can use the cold directories for the complete dataset,
> >>>>>>> simplifying backups and node replacements.
> >>>>>>>
> >>>>>>> For a little background, we had a ticket several years a

Re: [DISCUSS] CEP-36: A Configurable ChannelProxy to alias external storage locations

For what it's worth, as it might come across to somebody that I am
rejecting this altogether (which is not the case, all I am trying to say is
that we should just think about it more) - it would be cool to know more
about the experience of others when it comes to this. Maybe somebody
already tried to mount a bucket and it did not work as expected?

On the other hand, there is this "snapshots outside data dir" effort I am
doing and if we did it with this, then I can imagine that we could say "and
if you deal with snapshots, use this proxy instead" which would
transparently upload it to s3.

Then we would not need to do anything at all, code-wise. We would not need
to store snapshots "outside of data dir" just to be able to place it on a
directory which is mounted as an s3 bucket.

I don't know if it is possible to do it like that. Worth exploring, I guess.

I like mounted dirs for their simplicity and I guess that for copying files
they might be just enough. Plus we would not need to add all the s3 jars to
the classpath either, etc ...
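
To make the "use this proxy instead" idea for snapshots a bit more concrete, here is a minimal sketch of prefix-based aliasing. It is only an illustration of the routing idea: `AliasingProxy`, the dict backends, and the `/snapshots` prefix are hypothetical, and Cassandra's real snapshot layout (snapshots live under each table directory) is not modeled.

```python
class AliasingProxy:
    """Routes a path to a backend based on the longest matching prefix."""

    def __init__(self, default_backend):
        self.default = default_backend   # e.g. local disk, modeled as a dict
        self.aliases = {}                # normalized path prefix -> backend

    def alias(self, prefix, backend):
        # Normalize to a trailing slash so "/snapshots" does not match
        # an unrelated sibling such as "/snapshots-old".
        self.aliases[prefix.rstrip("/") + "/"] = backend

    def backend_for(self, path):
        matches = [p for p in self.aliases if path.startswith(p)]
        if not matches:
            return self.default
        return self.aliases[max(matches, key=len)]

    def write(self, path, data):
        self.backend_for(path)[path] = data
```

With `proxy.alias("/snapshots", remote)`, writes under that prefix land in `remote` while everything else stays on the default backend, and the calling code does not change.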

On Tue, Mar 4, 2025 at 2:46 PM Štefan Miklošovič 
wrote:

> I don't say that using remote object storage is useless.
>
> I am just saying that I don't see the difference. I have not measured that
> but I can imagine that s3 mounted would use, under the hood, the same calls
> to s3 api. How else would it be done? You need to talk to remote s3 storage
> eventually anyway. So why does it matter if we call s3 api from Java or by
> other means from some "s3 driver"?  It is eventually using same thing, no?
>
> On Tue, Mar 4, 2025 at 12:47 PM Jeff Jirsa  wrote:
>
>> Mounting an s3 bucket as a directory is an easy but poor implementation
>> of object backed storage for databases
>>
>> Object storage is durable (most data loss is due to bugs not concurrent
>> hardware failures), cheap (can 5-10x cheaper) and ubiquitous. A  huge
>> number of modern systems are object-storage-only because the approximately
>> infinite scale / cost / throughput tradeoffs often make up for the latency.
>>
>> Outright dismissing object storage for Cassandra is short sighted - it
>> needs to be done in a way that makes sense, not just blindly copying over
>> the block access patterns to object.
>>
>>
>> On Mar 4, 2025, at 11:19 AM, Štefan Miklošovič 
>> wrote:
>>
>> 
>> I do not think we need this CEP, honestly. I don't want to diss this
>> unnecessarily but if you mount a remote storage locally (e.g. mounting s3
>> bucket as if it was any other directory on node's machine), then what is
>> this CEP good for?
>>
>> Not talking about the necessity to put all dependencies to be able to
>> talk to respective remote storage to Cassandra's class path, introducing
>> potential problems with dependencies and their possible incompatibilities /
>> different versions etc ...
>>
>> On Thu, Feb 27, 2025 at 6:21 AM C. Scott Andreas 
>> wrote:
>>
>>> I’d love to see this implemented — where “this” is a proxy for some
>>> notion of support for remote object storage, perhaps usable by compaction
>>> strategies like TWCS to migrate data older than a threshold from a local
>>> filesystem to remote object.
>>>
>>> It’s not an area where I can currently dedicate engineering effort. But
>>> if others are interested in contributing a feature like this, I’d see it as
>>> valuable for the project and would be happy to collaborate on
>>> design/architecture/goals.
>>>
>>> – Scott
>>>
>>> On Feb 26, 2025, at 6:56 AM, guo Maxwell  wrote:
>>>
>>> 
>>> Is anyone else interested in continuing to discuss this topic?
>>>
>>> guo Maxwell wrote on Fri, Sep 20, 2024 at 09:44:
>>>
>>>> I discussed this offline with Claude, he is no longer working on this.
>>>>
>>>> It's a pity. I think this is a very valuable thing. Commitlog's
>>>> archiving and restore may be able to use the relevant code if it is
>>>> completed.
>>>>
>>>> Patrick McFadin wrote on Fri, Sep 20, 2024 at 2:01 AM:
>>>>
>>>>> Thanks for reviving this one!
>>>>>
>>>>> On Wed, Sep 18, 2024 at 12:06 AM guo Maxwell
>>>>> wrote:
>>>>>
>>>>>> Is there any update on this topic? It seems that things can make a
>>>>>> big progress if Jake Luciani can find someone who can make the
>>>>>> FileSystemProvider code accessible.
>>>>>>
>>>>>> Jon Haddad wrote on Sat, Dec 16, 2023 at 05:29:
>>>>>>
>>>>>>> At a high level I really like the idea of being able to better
>>>>>>> leverage cheaper storage especially object stores like S3.
>>>>>>>
>>>>>>> One important thing though - I feel pretty strongly that there's a
>>>>>>> big, deal breaking downside. Backups, disk failure policies,
>>>>>>> snapshots and possibly repairs would get more complicated which
>>>>>>> haven't been particularly great in the past, and of course there's
>>>>>>> the issue of failure recovery being only partially possible if
>>>>>>> you're looking at a durable block store paired with an ephemeral
>>>>>>> one with some of your data not replicated to the cold side. That
>>>>>>> introduces a failure case that's unacceptable for most teams, which
>>>>>>> results in need

Re: [DISCUSS] CEP-36: A Configurable ChannelProxy to alias external storage locations

2025-03-04 Thread Jeff Jirsa

Mounted dirs give up the opportunity to change the IO model to account for
different behaviors. The configurable channel proxy may suffer from the
same IO constraints depending on implementation, too. But it may also
become viable.

The snapshot outside of the mounted file system seems like you’re
implicitly implementing a one-off in-process backup. This is the same
feedback I gave the sidecar with the rsync to another machine proposal. If
we stop doing one-off tactical projects and map out the actual problem
we’re solving, we can get the right interfaces. Also, you can probably
just have the sidecar rsync thing do your snapshot to another directory on
host.

But if every sstable makes its way to s3, things like native backup,
restoring from backup, recovering from local volumes can look VERY
different

On Mar 4, 2025, at 3:57 PM, Štefan Miklošovič wrote:

> For what it's worth, as it might come to somebody I am rejecting this
> altogether (which is not the case, all I am trying to say is that we should
> just think about it more) - it would be cool to know more about the
> experience of others when it comes to this, maybe somebody already tried to
> mount and it did not work as expected?
>
> On the other hand, there is this "snapshots outside data dir" effort I am
> doing and if we did it with this, then I can imagine that we could say "and
> if you deal with snapshots, use this proxy instead" which would
> transparently upload it to s3.
>
> Then we would not need to do anything at all, code-wise. We would not need
> to store snapshots "outside of data dir" just to be able to place it on a
> directory which is mounted as an s3 bucket.
>
> I don't know if it is possible to do it like that. Worth to explore I guess.
>
> I like mounted dirs for its simplicity and I guess that for copying files
> it might be just enough. Plus we would not need to add all s3 jars on CP
> either etc ...
>
> On Tue, Mar 4, 2025 at 2:46 PM Štefan Miklošovič wrote:
>
>> I don't say that using remote object storage is useless.
>>
>> I am just saying that I don't see the difference. I have not measured
>> that but I can imagine that s3 mounted would use, under the hood, the same
>> calls to s3 api. How else would it be done? You need to talk to remote s3
>> storage eventually anyway. So why does it matter if we call s3 api from
>> Java or by other means from some "s3 driver"? It is eventually using same
>> thing, no?
>>
>> On Tue, Mar 4, 2025 at 12:47 PM Jeff Jirsa wrote:
>>
>>> Mounting an s3 bucket as a directory is an easy but poor implementation
>>> of object backed storage for databases
>>>
>>> Object storage is durable (most data loss is due to bugs not concurrent
>>> hardware failures), cheap (can 5-10x cheaper) and ubiquitous. A huge
>>> number of modern systems are object-storage-only because the
>>> approximately infinite scale / cost / throughput tradeoffs often make up
>>> for the latency.
>>>
>>> Outright dismissing object storage for Cassandra is short sighted - it
>>> needs to be done in a way that makes sense, not just blindly copying over
>>> the block access patterns to object.
>>>
>>> On Mar 4, 2025, at 11:19 AM, Štefan Miklošovič wrote:
>>>
>>> I do not think we need this CEP, honestly. I don't want to diss this
>>> unnecessarily but if you mount a remote storage locally (e.g. mounting s3
>>> bucket as if it was any other directory on node's machine), then what is
>>> this CEP good for?
>>>
>>> Not talking about the necessity to put all dependencies to be able to
>>> talk to respective remote storage to Cassandra's class path, introducing
>>> potential problems with dependencies and their possible
>>> incompatibilities / different versions etc ...
>>>
>>> On Thu, Feb 27, 2025 at 6:21 AM C. Scott Andreas wrote:
>>>
>>>> I’d love to see this implemented - where “this” is a proxy for some
>>>> notion of support for remote object storage, perhaps usable by
>>>> compaction strategies like TWCS to migrate data older than a threshold
>>>> from a local filesystem to remote object.
>>>>
>>>> It’s not an area where I can currently dedicate engineering effort.
>>>> But if others are interested in contributing a feature like this, I’d
>>>> see it as valuable for the project and would be happy to collaborate on
>>>> design/architecture/goals.
>>>>
>>>> – Scott
>>>>
>>>> On Feb 26, 2025, at 6:56 AM, guo Maxwell wrote:
>>>>
>>>> Is anyone else interested in continuing to discuss this topic?
>>>>
>>>> guo Maxwell wrote on Fri, Sep 20, 2024 at 09:44:
>>>>
>>>>> I discussed this offline with Claude, he is no longer working on this.
>>>>>
>>>>> It's a pity. I think this is a very valuable thing. Commitlog's
>>>>> archiving and restore may be able to use the relevant code if it is
>>>>> completed.
>>>>>
>>>>> Patrick McFadin wrote on Fri, Sep 20, 2024 at 2:01 AM:
>>>>>
>>>>>> Thanks for reviving this one!
>>>>>>
>>>>>> On Wed, Sep 18, 2024 at 12:06 AM guo Maxwell wrote:
>>>>>>
>>>>>>> Is there any update on this topic? It seems that things can make a
>>>>>>> big progress if Jake Luciani can find someone who can make the
>>>>>>> FileSystemProvider code accessible.
>>>>>>>
>>>>>>> Jon Haddad wrote on Sat, Dec 16, 2023 at 05:29:
>>>>>>>
>>>>>>>> At a high level I really like the idea of being abl

Re: Welcome Bernardo Botella as Cassandra Committer

2025-03-04 Thread Arjun Ashok

Congratulations Bernardo !!

On Mon, Mar 3, 2025 at 11:31 PM Štefan Miklošovič 
wrote:

> The Project Management Committee (PMC) for Apache Cassandra has invited
> Bernardo Botella to become a committer and we are pleased to announce that
> he has accepted.
>
> Please join us in welcoming Bernardo Botella to his new role and
> responsibility in our project community.
>
> Stefan Miklosovic
>
> On behalf of the Apache Cassandra PMC
>


-- 
Regards,
Arjun Ashok

Re: Welcome Bernardo Botella as Cassandra Committer

2025-03-04 Thread Jordan West

Congratulations!!

On Tue, Mar 4, 2025 at 10:16 Arjun Ashok  wrote:

> Congratulations Bernardo !!
>
> On Mon, Mar 3, 2025 at 11:31 PM Štefan Miklošovič 
> wrote:
>
>> The Project Management Committee (PMC) for Apache Cassandra has invited
>> Bernardo Botella to become a committer and we are pleased to announce that
>> he has accepted.
>>
>> Please join us in welcoming Bernardo Botella to his new role and
>> responsibility in our project community.
>>
>> Stefan Miklosovic
>>
>> On behalf of the Apache Cassandra PMC
>>
>
>
> --
> Regards,
> Arjun Ashok
>

Re: Welcome Bernardo Botella as Cassandra Committer

Welcome!

On Tue, Mar 4, 2025 at 10:25 AM Jordan West  wrote:

> Congratulations!!
>
> On Tue, Mar 4, 2025 at 10:16 Arjun Ashok  wrote:
>
>> Congratulations Bernardo !!
>>
>> On Mon, Mar 3, 2025 at 11:31 PM Štefan Miklošovič 
>> wrote:
>>
>>> The Project Management Committee (PMC) for Apache Cassandra has invited
>>> Bernardo Botella to become a committer and we are pleased to announce that
>>> he has accepted.
>>>
>>> Please join us in welcoming Bernardo Botella to his new role and
>>> responsibility in our project community.
>>>
>>> Stefan Miklosovic
>>>
>>> On behalf of the Apache Cassandra PMC
>>>
>>
>>
>> --
>> Regards,
>> Arjun Ashok
>>
>

Re: Welcome Aaron Ploetz as Cassandra Committer

Congrats Aaron!

On Tue, Mar 4, 2025 at 10:26 AM Jordan West  wrote:

> Congratulations!!
> On Tue, Mar 4, 2025 at 09:57 Tolbert, Andy  wrote:
>
>> Congrats Aaron!
>>
>> On Tue, Mar 4, 2025 at 11:24 AM Francisco Guerrero 
>> wrote:
>>
>>> Congratulations Aaron!
>>>
>>> On 2025/03/04 00:23:49 Patrick McFadin wrote:
>>> > The Apache Cassandra PMC is very happy to announce that Aaron Ploetz has
>>> > accepted the invitation to become a committer!
>>> >
>>> > Aaron has been tireless in his mission to help every single Cassandra
>>> > operator on planet Earth. If you don't believe me, check out his Stack
>>> > Overflow profile page: https://stackoverflow.com/users/1054558/aaron
>>> > He's been a continuous speaker on Cassandra topics and is one of the
>>> > coordinators for the Planet Cassandra meetup. Those are just the
>>> > recent highlights.
>>> >
>>> > Please join us in congratulating and welcoming Aaron.
>>> >
>>> > The Apache Cassandra PMC members
>>> >
>>>
>>

[DISCUSS] AWS IAM-based client authentication

2025-03-04 Thread Joel Shepherd

Hi - I have a side project that provides client- and node-side Java 
plug-ins to enable client-to-node authentication based on AWS 
identities. This would, for example, enable clients to use EC2 instance 
roles to authenticate to Cassandra nodes, or use ordinary IAM 
keys/secret keys. The client needs to be able to obtain valid IAM 
credentials to sign a request, and the node needs to be able to connect 
to a public AWS Security Token Service (STS) endpoint. There are no 
other required AWS dependencies, and (I believe) no changes required 
to driver or node code: just minor configuration updates.


I'm seeking help in reviewing the concept and code. I'm new to this 
community,  so I'm looking for suggestions on how to best engage you on 
this.


The code (which is not quite production-ready) is in two private GitHub 
repos which I'm happy to grant access to for early review. I can also 
provide documentation on the approach: not sure whether that's best 
shared via this thread, a CEP, repo documentation ... suggestions wanted.


Thanks: I'd appreciate any and all help in making these plug-ins 
available to the community.


-- Joel.

Re: Welcome Aaron Ploetz as Cassandra Committer

2025-03-04 Thread guo Maxwell

Congratulations Aaron!

On Tue, Mar 4, 2025 at 10:06 PM, Aaron wrote:

> Thank you everyone!
>
> Aaron
>
>
> On Tue, Mar 4, 2025 at 7:02 AM J. D. Jordan 
> wrote:
>
>> 🎉
>>
>> On Mar 4, 2025, at 5:49 AM, Ekaterina Dimitrova 
>> wrote:
>>
>> 
>> Congrats!!! 🎉
>>
>> On Tue, 4 Mar 2025 at 6:11, Josh McKenzie  wrote:
>>
>>> Congrats Aaron!
>>>
>>> On Tue, Mar 4, 2025, at 4:08 AM, Soheil Rahsaz wrote:
>>>
>>> Congratulations Aaron!
>>>
>>> On Tue, Mar 4, 2025 at 12:09 PM Paulo Motta  wrote:
>>>
>>> Congratulations Aaron, happy to see you recognized as a committer!
>>>
>>> Cheers,
>>>
>>> Paulo
>>>
>>> On Tue, 4 Mar 2025 at 03:26 Bernardo Botella <
>>> conta...@bernardobotella.com> wrote:
>>>
>>> That’s awesome!!
>>>
>>> Congratulations Aaron!! Long overdue for sure!
>>>
>>>
>>> On Mon, Mar 3, 2025 at 16:25 Patrick McFadin  wrote:
>>>
>>> The Apache Cassandra PMC is very happy to announce that Aaron Ploetz has
>>> accepted the invitation to become a committer!
>>>
>>> Aaron has been tireless in his mission to help every single Cassandra
>>> operator on planet Earth. If you don't believe me, check out his Stack
>>> Overflow profile page: https://stackoverflow.com/users/1054558/aaron
>>> He's been a continuous speaker on Cassandra topics and is one of the
>>> coordinators for the Planet Cassandra meetup. Those are just the
>>> recent highlights.
>>>
>>> Please join us in congratulating and welcoming Aaron.
>>>
>>> The Apache Cassandra PMC members
>>>
>>>

Re: Welcome Aaron Ploetz as Cassandra Committer

2025-03-04 Thread Tolbert, Andy

Congrats Aaron!

On Tue, Mar 4, 2025 at 11:24 AM Francisco Guerrero 
wrote:

> Congratulations Aaron!
>
> On 2025/03/04 00:23:49 Patrick McFadin wrote:
> > The Apache Cassandra PMC is very happy to announce that Aaron Ploetz has
> > accepted the invitation to become a committer!
> >
> > Aaron has been tireless in his mission to help every single Cassandra
> > operator on planet Earth. If you don't believe me, check out his Stack
> > Overflow profile page: https://stackoverflow.com/users/1054558/aaron
> > He's been a continuous speaker on Cassandra topics and is one of the
> > coordinators for the Planet Cassandra meetup. Those are just the
> > recent highlights.
> >
> > Please join us in congratulating and welcoming Aaron.
> >
> > The Apache Cassandra PMC members
> >
>

Re: Welcome Bernardo Botella as Cassandra Committer

2025-03-04 Thread Yifan Cai

Congrats Bernardo!

From: Tolbert, Andy 
Sent: Tuesday, March 4, 2025 9:27:59 AM
To: dev@cassandra.apache.org 
Subject: Re: Welcome Bernardo Botella as Cassandra Committer

Congrats Bernardo!!

On Tue, Mar 4, 2025 at 11:25 AM Francisco Guerrero wrote:
Congratulations Bernardo! Well deserved.

On 2025/03/04 07:30:06 Štefan Miklošovič wrote:
> The Project Management Committee (PMC) for Apache Cassandra has invited
> Bernardo Botella to become a committer and we are pleased to announce that
> he has accepted.
>
> Please join us in welcoming Bernardo Botella to his new role and
> responsibility in our project community.
>
> Stefan Miklosovic
>
> On behalf of the Apache Cassandra PMC
>

Re: Welcome Bernardo Botella as Cassandra Committer

2025-03-04 Thread Francisco Guerrero

Congratulations Bernardo! Well deserved.

On 2025/03/04 07:30:06 Štefan Miklošovič wrote:
> The Project Management Committee (PMC) for Apache Cassandra has invited
> Bernardo Botella to become a committer and we are pleased to announce that
> he has accepted.
> 
> Please join us in welcoming Bernardo Botella to his new role and
> responsibility in our project community.
> 
> Stefan Miklosovic
> 
> On behalf of the Apache Cassandra PMC
>

CEP-15 Update

2025-03-04 Thread Benedict Elliott Smith

Hi everyone,

It’s been exactly 3.5 years since the first commit to cassandra-accord. Yes,
really, it’s been that long.

We will be starting to validate the feature against real workloads in the near
future, so we can’t sensibly push off merging much longer. The following is a
brief run-down of the state of play. There are no known bugs, but there remain
a number of caveats we will be incrementally addressing in the run-up to a full
release:

[1] Accord is likely to be SLOW until further optimisations are implemented
[2] Schema changes have a number of hard edges
[3] Validation is ongoing, so there are likely still a number of bugs to shake
out
[4] Many operator visibility/tooling/documentation improvements are pending

To expand a little:

[1] As of the last experiment we conducted, accord’s throughput was poor - also
leading to higher LAN latencies. We have done no WAN experiments to date, but
the protocol guarantees should already achieve better round-trip performance,
in particular under contention. Improving throughput will be the main focus of
attention once we are satisfied the protocol is otherwise stable, but our focus
remains validation for the moment.
[2] Schema changes have not yet been well integrated with TCM. Dropping a table
for instance will currently cause problems if nodes are offline.
[3] We have a range of validations we are already performing against
cassandra-accord directly, and against its integration with Cassandra in
cep-15-accord. We have run hundreds of billions of simulated transactions, and
are still discovering some minor fault every few billion simulated transactions
or so. There remains a lot more simulated validation to explore, as well as
with real clusters serving real workloads.
[4] There are already a range of virtual tables for exploring internal state in
Accord, and reasonably good metric support. However, tracing is not yet
supported, and our metric and virtual table integrations need some further
development.
[5] There are also other edge cases to address such as ensuring we do not reuse
HLCs after restart, supporting ByteOrderPartitioner, and live migration from/to
Paxos is undergoing fine-tuning and validation; probably there are some other
things I am forgetting.
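
To make the HLC concern in [5] concrete, here is a minimal sketch of one
common way to avoid reusing hybrid logical clock (HLC) values after a
restart: persist a high-water mark ahead of the timestamps being issued,
and resume above it. This is an assumption-laden illustration, not
Accord's implementation; the class name, the state file, and the
reserve_millis parameter are all hypothetical.

```python
import json
import os
import time


class HybridLogicalClock:
    """Toy HLC issuing (wall_millis, counter) tuples that never repeat.

    Hypothetical illustration only: the persistence scheme is an assumption,
    not how Accord handles restarts.
    """

    def __init__(self, state_path, reserve_millis=1000, now=None):
        self._path = state_path
        self._reserve = reserve_millis
        self._now = now or (lambda: int(time.time() * 1000))
        # Every timestamp issued by a previous run is strictly below the
        # persisted limit, so resuming at (limit, 0) cannot reuse one even
        # if the wall clock moved backwards while the process was down.
        self._limit = self._load_limit()
        self._last = (self._limit, 0)

    def _load_limit(self):
        if os.path.exists(self._path):
            with open(self._path) as f:
                return json.load(f)["limit"]
        return 0

    def _persist_limit(self, limit):
        # Write to a temp file and rename so a crash never leaves a torn file.
        tmp = self._path + ".tmp"
        with open(tmp, "w") as f:
            json.dump({"limit": limit}, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self._path)
        self._limit = limit

    def next(self):
        wall, counter = self._last
        now = self._now()
        # Advance with the wall clock when possible, else bump the counter.
        candidate = (now, 0) if now > wall else (wall, counter + 1)
        # Extend the persisted reservation before handing out a timestamp
        # at or beyond the current limit.
        if candidate[0] >= self._limit:
            self._persist_limit(candidate[0] + self._reserve)
        self._last = candidate
        return candidate
```

The design choice here is to pay one durable write per reserve_millis of
issued timestamps rather than one per timestamp; a restarted clock skips
the unused remainder of the reservation.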

Altogether the feature is fairly mature, despite these caveats. This is the
fruit of the labour of a long list of contributors, including Aleksey
Yeschenko, Alex Petrov, Ariel Weisberg, Blake Eggleston, Caleb Rackliffe and
David Capwell, and represents a huge undertaking. It also wouldn’t have been
possible without the work of Alex Petrov, Marcus Eriksson and Sam Tunnicliffe
on delivering transactional cluster metadata. I hope you will join me in
thanking them all for their contributions.

Alex has also kindly produced some initial overview documentation for
developers, that can be found here:
https://github.com/apache/cassandra/blob/cep-15-accord/doc/modules/cassandra/pages/developing/accord/index.adoc.
This will be expanded as time permits.

Does anyone have any questions or concerns?

Re: CEP-15 Update

Very exciting!

I have a client that's very interested in Accord, so I should have budget
to dig into it, especially on the performance side of things.

Jon

On Tue, Mar 4, 2025 at 9:57 AM Dmitry Konstantinov 
wrote:

> Thank you to all Accord and TCM contributors, it is really exciting to see
> a development of such huge and wonderful features moving forward and
> opening the door to the new Cassandra epoch!
>
> On Tue, 4 Mar 2025 at 20:45, Blake Eggleston  wrote:
>
>> Thanks Benedict!
>>
>> I’m really excited to see accord reach this milestone, even with these
>> caveats. You seem to have left yourself off the list of contributors
>> though, even though you’ve been a central figure in its development :) So
>> thanks to all accord & tcm contributors, including Benedict, for making
>> this possible!
>>
>> On Tue, Mar 4, 2025, at 8:00 AM, Benedict Elliott Smith wrote:
>>
>> Hi everyone,
>>
>> It’s been exactly 3.5 years since the first commit to cassandra-accord.
>> Yes, really, it’s been that long.
>>
>> We will be starting to validate the feature against real workloads in the
>> near future, so we can’t sensibly push off merging much longer. The
>> following is a brief run-down of the state of play. There are no known
>> bugs, but there remain a number of caveats we will be incrementally
>> addressing in the run-up to a full release:
>>
>> [1] Accord is likely to be SLOW until further optimisations are
>> implemented
>> [2] Schema changes have a number of hard edges
>> [3] Validation is ongoing, so there are likely still a number of bugs to
>> shake out
>> [4] Many operator visibility/tooling/documentation improvements are
>> pending
>>
>> To expand a little:
>>
>> [1] As of the last experiment we conducted, accord’s throughput was poor
>> - also leading to higher LAN latencies. We have done no WAN experiments to
>> date, but the protocol guarantees should already achieve better round-trip
>> performance, in particular under contention. Improving throughput will be
>> the main focus of attention once we are satisfied the protocol is otherwise
>> stable, but our focus remains validation for the moment.
>> [2] Schema changes have not yet been well integrated with TCM. Dropping a
>> table for instance will currently cause problems if nodes are offline.
>> [3] We have a range of validations we are already performing against
>> cassandra-accord directly, and against its integration with Cassandra in
>> cep-15-accord. We have run hundreds of billions of simulated transactions,
>> and are still discovering some minor fault every few billion simulated
>> transactions or so. There remains a lot more simulated validation to
>> explore, as well as with real clusters serving real workloads.
>> [4] There are already a range of virtual tables for exploring internal
>> state in Accord, and reasonably good metric support. However, tracing is
>> not yet supported, and our metric and virtual table integrations need some
>> further development.
>> [5] There are also other edge cases to address such as ensuring we do not
>> reuse HLCs after restart, supporting ByteOrderPartitioner, and live
>> migration from/to Paxos is undergoing fine-tuning and validation; probably
>> there are some other things I am forgetting.
>>
>> Altogether the feature is fairly mature, despite these caveats. This is
>> the fruit of the labour of a long list of contributors, including Aleksey
>> Yeschenko, Alex Petrov, Ariel Weisberg, Blake Eggleston, Caleb Rackliffe
>> and David Capwell, and represents a huge undertaking. It also wouldn’t have
>> been possible without the work of Alex Petrov, Marcus Eriksson and Sam
>> Tunnicliffe on delivering transactional cluster metadata. I hope you will
>> join me in thanking them all for their contributions.
>>
>> Alex has also kindly produced some initial overview documentation for
>> developers, that can be found here:
>> https://github.com/apache/cassandra/blob/cep-15-accord/doc/modules/cassandra/pages/developing/accord/index.adoc.
>> This will be expanded as time permits.
>>
>> Does anyone have any questions or concerns?
>>
>>
>>
>
> --
> Dmitry Konstantinov
>

Re: Welcome Ekaterina Dimitrova as Cassandra PMC member

2025-03-04 Thread Aaron

Welcome Ekaterina! Congratulations!!!

On Tue, Mar 4, 2025 at 2:50 PM Yifan Cai  wrote:

> Congratulations!
> --
> *From:* Dmitry Konstantinov 
> *Sent:* Tuesday, March 4, 2025 12:40:48 PM
> *To:* dev@cassandra.apache.org 
> *Subject:* Re: Welcome Ekaterina Dimitrova as Cassandra PMC member
>
> Congrats Ekaterina!!
>
> On Tue, 4 Mar 2025 at 23:25, Paulo Motta  wrote:
>
> Aloha,
>
> The Project Management Committee (PMC) for Apache Cassandra is delighted
> to announce that Ekaterina Dimitrova has joined the PMC!
>
> Thanks a lot, Ekaterina, for everything you have done for the project all
> these years.
>
> The PMC - Project Management Committee - manages and guides the direction
> of the project, and is responsible for inviting new committers and PMC
> members to steward the longevity of the project.
>
> See https://community.apache.org/pmc/responsibilities.html if you're
> interested in learning more about the rights and responsibilities of PMC
> members.
>
> Please join us in welcoming Ekaterina Dimitrova to her new role in our
> project!
>
> Paulo, on behalf of the Apache Cassandra PMC
>
>
>
> --
> Dmitry Konstantinov
>

Re: Welcome Ekaterina Dimitrova as Cassandra PMC member

2025-03-04 Thread Yifan Cai

Congratulations!

From: Dmitry Konstantinov 
Sent: Tuesday, March 4, 2025 12:40:48 PM
To: dev@cassandra.apache.org 
Subject: Re: Welcome Ekaterina Dimitrova as Cassandra PMC member

Congrats Ekaterina!!

On Tue, 4 Mar 2025 at 23:25, Paulo Motta wrote:
Aloha,

The Project Management Committee (PMC) for Apache Cassandra is delighted to 
announce that Ekaterina Dimitrova has joined the PMC!

Thanks a lot, Ekaterina, for everything you have done for the project all these 
years.

The PMC - Project Management Committee - manages and guides the direction of 
the project, and is responsible for inviting new committers and PMC members to 
steward the longevity of the project.

See https://community.apache.org/pmc/responsibilities.html if you're interested 
in learning more about the rights and responsibilities of PMC members.

Please join us in welcoming Ekaterina Dimitrova to her new role in our project!

Paulo, on behalf of the Apache Cassandra PMC

--
Dmitry Konstantinov

Re: Welcome Aaron Ploetz as Cassandra Committer

2025-03-04 Thread Yifan Cai

Congrats!

From: Rahul Singh (ANANT) 
Sent: Tuesday, March 4, 2025 9:37:21 AM
To: dev@cassandra.apache.org 
Subject: Re: Welcome Aaron Ploetz as Cassandra Committer

Wooot woot Congrats Aaron

On Tue, Mar 04, 2025 at 12:24 PM, Francisco Guerrero wrote:

Congratulations Aaron!

On 2025/03/04 00:23:49 Patrick McFadin wrote:

The Apache Cassandra PMC is very happy to announce that Aaron Ploetz has 
accepted the invitation to become a committer!

Aaron has been tireless in his mission to help every single Cassandra operator 
on planet Earth. If you don't believe me, check out his Stack Overflow profile 
page: https://stackoverflow.com/users/1054558/aaron He's been a continuous 
speaker on Cassandra topics and is one of the coordinators for the Planet 
Cassandra meetup. Those are just the recent highlights.

Please join us in congratulating and welcoming Aaron.

The Apache Cassandra PMC members

Re: Welcome Bernardo Botella as Cassandra Committer

2025-03-04 Thread Aaron

Congratulations Bernardo! Glad to see this happen!


On Tue, Mar 4, 2025 at 7:12 AM J. D. Jordan 
wrote:

> Congrats!
>
> On Mar 4, 2025, at 5:48 AM, Ekaterina Dimitrova 
> wrote:
>
> 
> Congratulations!! 🎉
>
> On Tue, 4 Mar 2025 at 6:15, Josh McKenzie  wrote:
>
>> Congrats Bernardo - it's been great collaborating with you thus far and
>> looking forward to more!
>>
>> On Tue, Mar 4, 2025, at 4:13 AM, Paulo Motta wrote:
>>
>> ¡Felicitaciones Bernardo! Well deserved!
>>
>> Cheers,
>>
>> Paulo
>>
>> On Tue, 4 Mar 2025 at 06:10 Soheil Rahsaz 
>> wrote:
>>
>> Congrats, Bernardo! Well deserved! 🎉 Looking forward to seeing your
>> continued impact on the project.
>>
>>
>>
>> On Tue, Mar 4, 2025 at 12:03 PM Dmitry Konstantinov 
>> wrote:
>>
>> Congrats Bernardo!!
>>
>> On Tue, 4 Mar 2025 at 10:58, Berenguer Blasi 
>> wrote:
>>
>> Congrats!
>>
>> On 4/3/25 8:30, Štefan Miklošovič wrote:
>> > The Project Management Committee (PMC) for Apache Cassandra has
>> > invited Bernardo Botella to become a committer and we are pleased to
>> > announce that he has accepted.
>> >
>> > Please join us in welcoming Bernardo Botella to his new role and
>> > responsibility in our project community.
>> >
>> > Stefan Miklosovic
>> >
>> > On behalf of the Apache Cassandra PMC
>>
>>
>>
>> --
>> Dmitry Konstantinov
>>
>>
>>

Re: Fix versions for CASSANDRA-19596 improve IntervalTree build throughput

2025-03-04 Thread Ariel Weisberg

Hi,

Thanks for pointing that out Yuqi. I'm going to work on merging 20158 and 20164 
next. 

Since there are no objections I am going to assume lazy consensus and merge 
back to 5.0.

Ariel
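
As a rough illustration of the point made in the quoted discussion about
20158 (replacing an interval whose [low, high] bounds are unchanged leaves
it at the same node, so a full rebuild is unnecessary), here is a small
sketch using a textbook centered interval tree. It is not Cassandra's
IntervalTree; the Node, build and replace names are hypothetical, and
payloads are plain strings standing in for sstables.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Node:
    center: int
    # Intervals containing `center`, stored as (low, high, payload).
    here: List[Tuple[int, int, str]]
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(intervals):
    """Full rebuild of a centered interval tree from scratch."""
    if not intervals:
        return None
    points = sorted(p for lo, hi, _ in intervals for p in (lo, hi))
    center = points[len(points) // 2]
    here = [iv for iv in intervals if iv[0] <= center <= iv[1]]
    left = [iv for iv in intervals if iv[1] < center]
    right = [iv for iv in intervals if iv[0] > center]
    return Node(center, here, build(left), build(right))


def replace(node, low, high, old, new):
    """Return a tree where payload `old` at [low, high] becomes `new`.

    Only nodes on the search path are copied; untouched subtrees are shared.
    Placement depends on the bounds alone, so the shape cannot change.
    """
    if node is None:
        raise KeyError((low, high, old))
    if high < node.center:
        return Node(node.center, node.here,
                    replace(node.left, low, high, old, new), node.right)
    if low > node.center:
        return Node(node.center, node.here, node.left,
                    replace(node.right, low, high, old, new))
    if (low, high, old) not in node.here:
        raise KeyError((low, high, old))
    here = [(low, high, new) if iv == (low, high, old) else iv
            for iv in node.here]
    return Node(node.center, here, node.left, node.right)
```

In this sketch, replacing a payload and rebuilding from the updated
interval list produce equal trees, while the replacement touches only one
root-to-node path instead of re-sorting every endpoint.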

On Thu, Feb 27, 2025, at 2:44 PM, Yuqi Yan wrote:
> Thanks Ariel for bringing this to the dev mailing list. 
> I want to add a few more notes for 20158 and 20164:
> 
> > The only problem with this approach is that the tree is not rebalanced
> Actually 20158 won't build an imbalanced tree - to replace an interval in the 
> tree, the before and after must have the same intervals [low, high]. 
> For such a case, even if you rebuild the tree, the new interval will just be 
> placed at the exact same node in the tree, which makes it very unnecessary to 
> rebuild the entire tree.
> 
> By default C* is running with the preemptive open feature (50MB size), and
> with the default 160MB setup for LCS, 20158 can save you at least 50% of the
> time & CPU spent building the interval trees per compaction task (you can
> check the test results in the ticket).
> 
> > force a full rebuild periodically or based on some balance signal
> Agreed. I think this can be achieved by checking the last update time and
> falling back to the rebuild method if the tree has not been rebuilt after,
> say, 5 minutes or so, as a safeguard.
> 
> 
> I want to point out that 20164 (or any other better idea) is still needed to 
> *prevent the memtable flushing from being choked* by this low interval tree 
> build throughput.
> I attached the test results in CASSANDRA-20159, showing that even with the 
> global improvement for interval tree build from 19596, memtable flushing can 
> still take more than minutes to complete.
> Memtable flushing is *single-threaded* and needs to compete with other view 
> updates (compactions, index summary rebuild, etc.). Not all view updates 
> require an interval tree rebuild. Supporting addInterval will make this 
> update way faster (several orders of magnitude) than at least compaction ops 
> (checkpoint()), which is the most common type of view update you can see 
> on a normal node. With this you'll see much lower contention for view 
> updates from memtable flushing and prevent the node from being stuck.
> 
> Let me know if I can do something to improve the patches / make the review 
> process smoother :)
> 
> 
> On Thu, Feb 27, 2025 at 10:20 AM Jon Haddad  wrote:
>> I’ve encountered a handful of spinning platters, but not a lot. 
>> 
>> I think we should generally optimize for the common case, not the exception. 
>> 
>> 
>> On Thu, Feb 27, 2025 at 9:51 AM Josh McKenzie  wrote:
>>> This is a significant enough performance problem *in normal operations* I'd 
>>> consider it a bug and thus eligible for back-porting. A couple other 
>>> thoughts:
>>> 
>>>> CASSANDRA-20158 and CASSANDRA-20164 are several orders of magnitude
>>>> faster ... The only problem with this approach is that the tree is not
>>>> rebalanced... it would be trivial to force a full rebuild periodically or
>>>> based on some balance signal.
>>> I'd also be strongly in support of these modifications landing in all 
>>> currently supported branches if someone has the time and energy to step up 
>>> and do the work given the combined current test coverage and robust 
>>> property-based additions to the testing in 19596.
>>> 
>>>> Early open which is enabled by default
>>> Is it fair to assume most current C* nodes will no longer be on spinning 
>>> disk and thus we should change this default to disabled?
>>> 
>>> On Thu, Feb 27, 2025, at 12:35 PM, Ariel Weisberg wrote:
>>>> Hi,
>>>>
>>>> I want to discuss what versions we should backport IntervalTree
>>>> improvements to, specifically 19596, which I think is the lower risk option
>>>> because it builds the same trees as before. I think we should at least
>>>> backport to 5.0.
>>>>
>>>> IntervalTree performance has shown up as a problematic bottleneck in a
>>>> couple of scenarios. If a node ends up with lots of small sstables it will
>>>> fall over trying to build IntervalTrees because they are very slow. With
>>>> 100k sstables which isn't unusual in the normal case for leveled
>>>> compaction it takes 200+ milliseconds to build one on my laptop.
>>>>
>>>> When this happens compactions and flushing can't complete and the mutation
>>>> stage backs up along with usual problems associated with a large
>>>> compaction backlog. Early open which is enabled by default makes this much
>>>> worse because it causes compaction to rebuild IntervalTree many times more
>>>> often.
>>>>
>>>> The problem is the tree is immutable and currently the entire tree is
>>>> rebuilt every time it is updated. The proper fix is to create a persistent
>>>> tree, but no one has had the time at the moment and it's not a great
>>>> candidate for back porting to existing releases.
>>>>
>>>> CASSANDRA-19596 builds identical trees but aims to make the building
>>>> process significa

Re: Welcome Bernardo Botella as Cassandra Committer

2025-03-04 Thread Tolbert, Andy

Congrats Bernardo!!

On Tue, Mar 4, 2025 at 11:25 AM Francisco Guerrero 
wrote:

> Congratulations Bernardo! Well deserved.
>
> On 2025/03/04 07:30:06 Štefan Miklošovič wrote:
> > The Project Management Committee (PMC) for Apache Cassandra has invited
> > Bernardo Botella to become a committer and we are pleased to announce that
> > he has accepted.
> >
> > Please join us in welcoming Bernardo Botella to his new role and
> > responsibility in our project community.
> >
> > Stefan Miklosovic
> >
> > On behalf of the Apache Cassandra PMC
> >
>

Welcome Ekaterina Dimitrova as Cassandra PMC member

Aloha,

The Project Management Committee (PMC) for Apache Cassandra is delighted to
announce that Ekaterina Dimitrova has joined the PMC!

Thanks a lot, Ekaterina, for everything you have done for the project all
these years.

The PMC - Project Management Committee - manages and guides the direction
of the project, and is responsible for inviting new committers and PMC
members to steward the longevity of the project.

See https://community.apache.org/pmc/responsibilities.html if you're
interested in learning more about the rights and responsibilities of PMC
members.

Please join us in welcoming Ekaterina Dimitrova to her new role in our
project!

Paulo, on behalf of the Apache Cassandra PMC

Re: [DISCUSS] CEP-36: A Configurable ChannelProxy to alias external storage locations

Jeff,

when it comes to snapshots, there was already discussion in the other
thread I am not sure you are aware of (1), here (2) I am talking about
Sidecar + snapshots specifically. One "caveat" of Sidecar is that you
actually _need_ sidecar if we ever contemplated Sidecar doing upload /
backup (by whatever means).

"s3 bucket as a mounted directory" bypasses the necessity of having Sidecar
deployed. I do not want to repeat what I wrote in (2), all the reasoning is
there.

My primary motivation to do it by mounting is to 1) not use Sidecar if not
needed 2) just save snapshot SSTables outside of table data dirs (which is
imho a completely fine requirement on its own, e.g. putting snapshots on a
different / slow disk etc, why do they have to be on the same disk as the
data?)

The vibe I got from that thread was that what I was proposing is in general
acceptable and I wanted to figure out the details as Blake was mentioning
incremental backups as well. Other than that, I was thinking that we are
pretty much settled on how that should be, that thread was up for a long
time so I was thinking people in general do not have a problem with that.

How I read your email, specifically this part:

"This is the same feedback I gave the sidecar with the rsync to another
machine proposal. If we stop doing one off tactical projects and map out
the actual problem we’re solving, we can get the right interfaces."

it seems to me that you are categorizing "s3 bucket mounted locally" as one
of these "one off tactical projects" which in translation means that you
would like to see that approach not implemented and we should rather focus
on doing everything via proxies?

One big downside of that, which nobody has answered yet (as far as I
checked), is what this would actually look like for deployment and
delivery. As I said in (1), we would need to code up a proxy for every
remote storage, Azure, S3, GCP to name a few. We would need to implement
all of these for each and every cloud.

Secondly, who would implement that and where would that code live? Is it up
to individuals to code it up internally? When we want to talk to s3, we
need to put all s3 dependencies to classpath, who is going to integrate
that, is that even possible? Similar for other clouds.

By mounting a dir, we just do not do anything with Cassandra's class path.
It is as it was, simple, easy, and we interact with it as we are used to.

I see that a proxy might be viable for some applications but I think it
also has non-trivial operational disadvantages.

(1) https://lists.apache.org/thread/8cz5fh835ojnxwtn1479q31smm5x7nxt
(2) https://lists.apache.org/thread/mttg75ps49qkob6km4l74fmp879v76qs
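
To illustrate the trade-off argued above, here is a minimal sketch
contrasting the two approaches for placing snapshot files. All names are
hypothetical: SnapshotSink, MountedDirSink, ObjectStoreSink and
take_snapshot are not Cassandra or Sidecar APIs, and InMemoryClient is a
stand-in for a real cloud SDK client (which is where the extra classpath
dependencies would come in).

```python
import shutil
from pathlib import Path
from typing import Protocol


class SnapshotSink(Protocol):
    """Hypothetical abstraction over where snapshot files end up."""

    def store(self, sstable: Path, name: str) -> None: ...


class MountedDirSink:
    """Mounted-directory approach: plain file IO, no store-specific code.

    Whether the directory is a local disk or a mounted bucket is invisible
    here, which is both the simplicity and the limitation being discussed.
    """

    def __init__(self, mount_point: Path):
        self.mount_point = mount_point

    def store(self, sstable: Path, name: str) -> None:
        self.mount_point.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(sstable, self.mount_point / name)


class ObjectStoreSink:
    """Proxy-style approach: calls an object API through an injected client.

    Each remote store would need its own client and its own dependencies.
    """

    def __init__(self, client, bucket: str):
        self.client = client
        self.bucket = bucket

    def store(self, sstable: Path, name: str) -> None:
        self.client.put_object(self.bucket, name, sstable.read_bytes())


class InMemoryClient:
    """Stand-in for a cloud SDK client so the sketch runs on its own."""

    def __init__(self):
        self.objects = {}

    def put_object(self, bucket, key, body):
        self.objects[(bucket, key)] = body


def take_snapshot(sstables, sink: SnapshotSink) -> None:
    # The caller is identical for both sinks; only the sink differs.
    for path in sstables:
        sink.store(path, path.name)
```

With the mounted sink the process needs nothing beyond the standard
library; with the object-store sink the IO model is under the
application's control, at the cost of one client implementation per store.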

On Tue, Mar 4, 2025 at 5:13 PM Jeff Jirsa  wrote:

> Mounted dirs give up the opportunity to change the IO model to account for
> different behaviors. The configurable channel proxy may suffer from the
> same IO constraints depending on implementation, too. But it may also
> become viable.
>
> The snapshot outside of the mounted file system seems like you’re
> implicitly implementing a one off in process backup. This is the same
> feedback I gave the sidecar with the rsync to another machine proposal. If
> we stop doing one off tactical projects and map out the actual problem
> we’re solving, we can get the right interfaces.  Also, you can probably
> just have the sidecar rsync thing do your snapshot to another directory on
> host.
>
> But if every sstable makes its way to s3, things like native backup,
> restoring from backup, recovering from local volumes can look VERY different
>
>
>
> On Mar 4, 2025, at 3:57 PM, Štefan Miklošovič 
> wrote:
>
> 
> For what it's worth, as it might seem to somebody that I am rejecting this
> altogether (which is not the case, all I am trying to say is that we should
> just think about it more) - it would be cool to know more about the
> experience of others when it comes to this, maybe somebody already tried to
> mount and it did not work as expected?
>
> On the other hand, there is this "snapshots outside data dir" effort I am
> doing and if we did it with this, then I can imagine that we could say "and
> if you deal with snapshots, use this proxy instead" which would
> transparently upload it to s3.
>
> Then we would not need to do anything at all, code-wise. We would not need
> to store snapshots "outside of data dir" just to be able to place it on a
> directory which is mounted as an s3 bucket.
>
> I don't know if it is possible to do it like that. Worth exploring, I
> guess.
>
> I like mounted dirs for their simplicity and I guess that for copying files
> it might be just enough. Plus we would not need to add all s3 jars on CP
> either etc ...
>
> On Tue, Mar 4, 2025 at 2:46 PM Štefan Miklošovič 
> wrote:
>
>> I don't say that using remote object storage is useless.
>>
>> I am just saying that I don't see the difference. I have not measured
>> that but I can imagine that s3 mounted would use, under the hood, the same
>> calls to s

Re: Welcome Aaron Ploetz as Cassandra Committer

2025-03-04 Thread Jordan West

Congratulations!!
On Tue, Mar 4, 2025 at 09:57 Tolbert, Andy  wrote:

> Congrats Aaron!
>
> On Tue, Mar 4, 2025 at 11:24 AM Francisco Guerrero 
> wrote:
>
>> Congratulations Aaron!
>>
>> On 2025/03/04 00:23:49 Patrick McFadin wrote:
>> > The Apache Cassandra PMC is very happy to announce that Aaron Ploetz has
>> > accepted the invitation to become a committer!
>> >
>> > Aaron has been tireless in his mission to help every single Cassandra
>> > operator on planet Earth. If you don't believe me, check out his Stack
>> > Overflow profile page: https://stackoverflow.com/users/1054558/aaron
>> > He's been a continuous speaker on Cassandra topics and is one of the
>> > coordinators for the Planet Cassandra meetup. Those are just the
>> > recent highlights.
>> >
>> > Please join us in congratulating and welcoming Aaron.
>> >
>> > The Apache Cassandra PMC members
>> >
>>
>

Re: Welcome Ekaterina Dimitrova as Cassandra PMC member

2025-03-04 Thread Jasonstack Zhao Yang

Congratulations Ekaterina!

On Wed, 5 Mar 2025 at 08:18, Josh McKenzie  wrote:

> Welcome Ekaterina!  \o/
>
> On Tue, Mar 4, 2025, at 7:07 PM, Francisco Guerrero wrote:
>
> Congratulations Ekaterina! Well deserved!
>
> On 2025/03/04 20:25:08 Paulo Motta wrote:
> > Aloha,
> >
> > The Project Management Committee (PMC) for Apache Cassandra is delighted
> to
> > announce that Ekaterina Dimitrova has joined the PMC!
> >
> > Thanks a lot, Ekaterina, for everything you have done for the project all
> > these years.
> >
> > The PMC - Project Management Committee - manages and guides the direction
> > of the project, and is responsible for inviting new committers and PMC
> > members to steward the longevity of the project.
> >
> > See https://community.apache.org/pmc/responsibilities.html if you're
> > interested in learning more about the rights and responsibilities of PMC
> > members.
> >
> > Please join us in welcoming Ekaterina Dimitrova to her new role in our
> > project!
> >
> > Paulo, on behalf of the Apache Cassandra PMC
> >
>
>
>

Re: Welcome Ekaterina Dimitrova as Cassandra PMC member

2025-03-04 Thread Berenguer Blasi


Congrats Ekaterina!

On 5/3/25 2:03, Jasonstack Zhao Yang wrote:

Congratulations Ekaterina!

On Wed, 5 Mar 2025 at 08:18, Josh McKenzie  wrote:

Welcome Ekaterina!  \o/

On Tue, Mar 4, 2025, at 7:07 PM, Francisco Guerrero wrote:

Congratulations Ekaterina! Well deserved!

On 2025/03/04 20:25:08 Paulo Motta wrote:
> Aloha,
>
> The Project Management Committee (PMC) for Apache Cassandra is
delighted to
> announce that Ekaterina Dimitrova has joined the PMC!
>
> Thanks a lot, Ekaterina, for everything you have done for the
project all
> these years.
>
> The PMC - Project Management Committee - manages and guides the
direction
> of the project, and is responsible for inviting new committers
and PMC
> members to steward the longevity of the project.
>
> See https://community.apache.org/pmc/responsibilities.html if
you're
> interested in learning more about the rights and
responsibilities of PMC
> members.
>
> Please join us in welcoming Ekaterina Dimitrova to her new role
in our
> project!
>
> Paulo, on behalf of the Apache Cassandra PMC
>

Re: Welcome Ekaterina Dimitrova as Cassandra PMC member

2025-03-04 Thread guo Maxwell

Congratulations!
On Wed, Mar 5, 2025 at 5:15 AM, Rahul Singh (ANANT) wrote:

> Congrats Ekaterina!
>
> Sent via Superhuman iOS 
>
>
> On Tue, Mar 4, 2025 at 3:57 PM, Aaron  wrote:
>
>> Welcome Ekaterina! Congratulations!!!
>>
>> On Tue, Mar 4, 2025 at 2:50 PM Yifan Cai  wrote:
>>
>>> Congratulations!
>>> --
>>> *From:* Dmitry Konstantinov 
>>> *Sent:* Tuesday, March 4, 2025 12:40:48 PM
>>> *To:* dev@cassandra.apache.org 
>>> *Subject:* Re: Welcome Ekaterina Dimitrova as Cassandra PMC member
>>>
>>> Congrats Ekaterina!!
>>>
>>> On Tue, 4 Mar 2025 at 23:25, Paulo Motta  wrote:
>>>
>>> Aloha,
>>>
>>> The Project Management Committee (PMC) for Apache Cassandra is delighted
>>> to announce that Ekaterina Dimitrova has joined the PMC!
>>>
>>> Thanks a lot, Ekaterina, for everything you have done for the project
>>> all these years.
>>>
>>> The PMC - Project Management Committee - manages and guides the
>>> direction of the project, and is responsible for inviting new committers
>>> and PMC members to steward the longevity of the project.
>>>
>>> See https://community.apache.org/pmc/responsibilities.html if you're
>>> interested in learning more about the rights and responsibilities of PMC
>>> members.
>>>
>>> Please join us in welcoming Ekaterina Dimitrova to her new role in our
>>> project!
>>>
>>> Paulo, on behalf of the Apache Cassandra PMC
>>>
>>>
>>>
>>> --
>>> Dmitry Konstantinov
>>>
>>

Re: CEP-15 Update

2025-03-04 Thread Patrick McFadin

"Captain's log. It's been three and a half years, and we haven't seen
any land yet but saw sea birds and know land is near..."

It's taken the time that it takes, but it's time to merge and I think we're
all ready to pitch in however we can. This is a feature for the next
generation of Cassandra users. Tally ho!

Yes. I'm excited. Thanks to everyone that got us this far! d

Re: Welcome Ekaterina Dimitrova as Cassandra PMC member

2025-03-04 Thread Brandon Williams

Congratulations Ekaterina!

Kind Regards,
Brandon

On Tue, Mar 4, 2025 at 2:25 PM Paulo Motta  wrote:
>
> Aloha,
>
> The Project Management Committee (PMC) for Apache Cassandra is delighted to 
> announce that Ekaterina Dimitrova has joined the PMC!
>
> Thanks a lot, Ekaterina, for everything you have done for the project all 
> these years.
>
> The PMC - Project Management Committee - manages and guides the direction of 
> the project, and is responsible for inviting new committers and PMC members 
> to steward the longevity of the project.
>
> See https://community.apache.org/pmc/responsibilities.html if you're 
> interested in learning more about the rights and responsibilities of PMC 
> members.
>
> Please join us in welcoming Ekaterina Dimitrova to her new role in our 
> project!
>
> Paulo, on behalf of the Apache Cassandra PMC

Re: Welcome Ekaterina Dimitrova as Cassandra PMC member

2025-03-04 Thread Patrick McFadin

Congratulations Ekaterina!

On Tue, Mar 4, 2025 at 1:13 PM Rahul Singh (ANANT) 
wrote:

> Congrats Ekaterina!
>
> Sent via Superhuman iOS 
>
>
> On Tue, Mar 4, 2025 at 3:57 PM, Aaron  wrote:
>
>> Welcome Ekaterina! Congratulations!!!
>>
>> On Tue, Mar 4, 2025 at 2:50 PM Yifan Cai  wrote:
>>
>>> Congratulations!
>>> --
>>> *From:* Dmitry Konstantinov 
>>> *Sent:* Tuesday, March 4, 2025 12:40:48 PM
>>> *To:* dev@cassandra.apache.org 
>>> *Subject:* Re: Welcome Ekaterina Dimitrova as Cassandra PMC member
>>>
>>> Congrats Ekaterina!!
>>>
>>> On Tue, 4 Mar 2025 at 23:25, Paulo Motta  wrote:
>>>
>>> Aloha,
>>>
>>> The Project Management Committee (PMC) for Apache Cassandra is delighted
>>> to announce that Ekaterina Dimitrova has joined the PMC!
>>>
>>> Thanks a lot, Ekaterina, for everything you have done for the project
>>> all these years.
>>>
>>> The PMC - Project Management Committee - manages and guides the
>>> direction of the project, and is responsible for inviting new committers
>>> and PMC members to steward the longevity of the project.
>>>
>>> See https://community.apache.org/pmc/responsibilities.html if you're
>>> interested in learning more about the rights and responsibilities of PMC
>>> members.
>>>
>>> Please join us in welcoming Ekaterina Dimitrova to her new role in our
>>> project!
>>>
>>> Paulo, on behalf of the Apache Cassandra PMC
>>>
>>>
>>>
>>> --
>>> Dmitry Konstantinov
>>>
>>

Re: Welcome Ekaterina Dimitrova as Cassandra PMC member

2025-03-04 Thread David Capwell

Congrats!  Very deserving! 

> On Mar 4, 2025, at 3:09 PM, Brandon Williams  wrote:
> 
> Congratulations Ekaterina!
> 
> Kind Regards,
> Brandon
> 
> On Tue, Mar 4, 2025 at 2:25 PM Paulo Motta  wrote:
>> 
>> Aloha,
>> 
>> The Project Management Committee (PMC) for Apache Cassandra is delighted to 
>> announce that Ekaterina Dimitrova has joined the PMC!
>> 
>> Thanks a lot, Ekaterina, for everything you have done for the project all 
>> these years.
>> 
>> The PMC - Project Management Committee - manages and guides the direction of 
>> the project, and is responsible for inviting new committers and PMC members 
>> to steward the longevity of the project.
>> 
>> See https://community.apache.org/pmc/responsibilities.html if you're 
>> interested in learning more about the rights and responsibilities of PMC 
>> members.
>> 
>> Please join us in welcoming Ekaterina Dimitrova to her new role in our 
>> project!
>> 
>> Paulo, on behalf of the Apache Cassandra PMC

Re: [DISCUSS] CEP-36: A Configurable ChannelProxy to alias external storage locations

I've come around on the rsync vs built in topic, and I think this is the
same.  Having this managed in process gives us more options for control.

I think it's critical that 100% of the data be pushed to the object store.
I mentioned this in my email on Dec 15, but nobody directly responded to
that.  I keep seeing messages that sound like people only want a subset of
SSTables to live in the object store, I think that would be a massive
mistake.

Jon



On Tue, Mar 4, 2025 at 8:11 AM Jeff Jirsa  wrote:

> Mounted dirs give up the opportunity to change the IO model to account for
> different behaviors. The configurable channel proxy may suffer from the
> same IO constraints depending on implementation, too. But it may also
> become viable.
>
> The snapshot outside of the mounted file system seems like you’re
> implicitly implementing a one off in process backup. This is the same
> feedback I gave the sidecar with the rsync to another machine proposal. If
> we stop doing one off tactical projects and map out the actual problem
> we’re solving, we can get the right interfaces.  Also, you can probably
> just have the sidecar rsync thing do your snapshot to another directory on
> host.
>
> But if every sstable makes its way to s3, things like native backup,
> restoring from backup, recovering from local volumes can look VERY different
>
>
>
> On Mar 4, 2025, at 3:57 PM, Štefan Miklošovič 
> wrote:
>
> 
> For what it's worth, as it might seem to somebody that I am rejecting this
> altogether (which is not the case, all I am trying to say is that we should
> just think about it more) - it would be cool to know more about the
> experience of others when it comes to this, maybe somebody already tried to
> mount and it did not work as expected?
>
> On the other hand, there is this "snapshots outside data dir" effort I am
> doing and if we did it with this, then I can imagine that we could say "and
> if you deal with snapshots, use this proxy instead" which would
> transparently upload it to s3.
>
> Then we would not need to do anything at all, code-wise. We would not need
> to store snapshots "outside of data dir" just to be able to place it on a
> directory which is mounted as an s3 bucket.
>
> I don't know if it is possible to do it like that. Worth exploring, I
> guess.
>
> I like mounted dirs for their simplicity and I guess that for copying files
> it might be just enough. Plus we would not need to add all s3 jars on CP
> either etc ...
>
> On Tue, Mar 4, 2025 at 2:46 PM Štefan Miklošovič 
> wrote:
>
>> I don't say that using remote object storage is useless.
>>
>> I am just saying that I don't see the difference. I have not measured
>> that but I can imagine that s3 mounted would use, under the hood, the same
>> calls to s3 api. How else would it be done? You need to talk to remote s3
>> storage eventually anyway. So why does it matter if we call s3 api from
>> Java or by other means from some "s3 driver"?  It is eventually using same
>> thing, no?
>>
>> On Tue, Mar 4, 2025 at 12:47 PM Jeff Jirsa  wrote:
>>
>>> Mounting an s3 bucket as a directory is an easy but poor implementation
>>> of object backed storage for databases
>>>
>>> Object storage is durable (most data loss is due to bugs not concurrent
>>> hardware failures), cheap (can be 5-10x cheaper) and ubiquitous. A huge
>>> number of modern systems are object-storage-only because the approximately
>>> infinite scale / cost / throughput tradeoffs often make up for the latency.
>>>
>>> Outright dismissing object storage for Cassandra is short sighted - it
>>> needs to be done in a way that makes sense, not just blindly copying over
>>> the block access patterns to object.
>>>
>>>
>>> On Mar 4, 2025, at 11:19 AM, Štefan Miklošovič 
>>> wrote:
>>>
>>> 
>>> I do not think we need this CEP, honestly. I don't want to diss this
>>> unnecessarily but if you mount a remote storage locally (e.g. mounting s3
>>> bucket as if it was any other directory on node's machine), then what is
>>> this CEP good for?
>>>
>>> Not talking about the necessity to put all dependencies to be able to
>>> talk to respective remote storage to Cassandra's class path, introducing
>>> potential problems with dependencies and their possible incompatibilities /
>>> different versions etc ...
>>>
>>> On Thu, Feb 27, 2025 at 6:21 AM C. Scott Andreas 
>>> wrote:
>>>
>>>> I’d love to see this implemented — where “this” is a proxy for some
>>>> notion of support for remote object storage, perhaps usable by compaction
>>>> strategies like TWCS to migrate data older than a threshold from a local
>>>> filesystem to remote object.
>>>>
>>>> It’s not an area where I can currently dedicate engineering effort. But
>>>> if others are interested in contributing a feature like this, I’d see it as
>>>> valuable for the project and would be happy to collaborate on
>>>> design/architecture/goals.
>>>>
>>>> – Scott
>>>>
>>>> On Feb 26, 2025, at 6:56 AM, guo Maxwell  wrote:
>>>>
>>>>>
>>>>> Is anyone else interested

Re: Welcome Ekaterina Dimitrova as Cassandra PMC member

2025-03-04 Thread Josh McKenzie

Welcome Ekaterina!  \o/

On Tue, Mar 4, 2025, at 7:07 PM, Francisco Guerrero wrote:
> Congratulations Ekaterina! Well deserved!
> 
> On 2025/03/04 20:25:08 Paulo Motta wrote:
> > Aloha,
> > 
> > The Project Management Committee (PMC) for Apache Cassandra is delighted to
> > announce that Ekaterina Dimitrova has joined the PMC!
> > 
> > Thanks a lot, Ekaterina, for everything you have done for the project all
> > these years.
> > 
> > The PMC - Project Management Committee - manages and guides the direction
> > of the project, and is responsible for inviting new committers and PMC
> > members to steward the longevity of the project.
> > 
> > See https://community.apache.org/pmc/responsibilities.html if you're
> > interested in learning more about the rights and responsibilities of PMC
> > members.
> > 
> > Please join us in welcoming Ekaterina Dimitrova to her new role in our
> > project!
> > 
> > Paulo, on behalf of the Apache Cassandra PMC
> > 
>

Re: [DISCUSS] CEP-36: A Configurable ChannelProxy to alias external storage locations

2025-03-04 Thread Cheng Wang via dev

I agree with all the points mentioned by Scott. We are actually very
interested in exploring tiered storage for the same reasons above. Our
first experiment with S3 single zone express was, unfortunately, awfully
slow compared to ephemeral and EBS.

On Tue, Mar 4, 2025 at 9:22 PM C. Scott Andreas 
wrote:

> To Jeff’s point on tactical vs. strategic, here’s the big picture for me
> on object storage:
>
> *– Object storage is 70% cheaper:*
> Replicated flash block storage is extremely expensive, and more so with
> compute resources constantly attached. If one were to build a storage
> platform on top of a cloud provider’s compute and storage infrastructure,
> selective use of object storage is essential to even being in the ballpark
> of managed offerings on price. EBS is 8¢/GB. S3 is 2.3¢/GB. It’s over 70%
> cheaper.
>
> *– Local/block storage is priced on storage *provisioned*. Object storage
> is priced on storage *consumed*:*
> It’s actually better than 70%. Local/block storage is priced based on the
> size of disks/volumes provisioned. While they may be resizable, resizing is
> generally inelastic. This typically produces a large gap between storage
> consumed vs. storage provisioned - and poor utilization. Object storage is
> typically priced on storage that is actually consumed.
>
> *– Object storage integration is the simplest path to complete decoupling
> of CPU and storage:*
> Block volumes are more fungible than local disk, but aren’t even close in
> flexibility to an SSTable that can be accessed by any function. Object is
> also the only sensible path to implementing a serverless database whose
> query facilities can be deployed on a function-as-a-service platform. That
> enables one to reach an idle compute cost of zero and an idle storage cost
> of 2.3¢/GB/month (S3).
>
> *– Object storage enables scale-to-zero:*
> Object storage integration is the only path for most databases to provide
> a scale-to-zero offering that doesn’t rely on keeping hot NVMe or block
> storage attached 24/7 while a database receives zero queries per second.
>
> *– Scale to zero is one of the easiest paths to zero marginal cost (the
> other is multitenancy - and not mutually exclusive):*
> Database platforms operated in a cluster-as-a-service model incur a
> constant fixed cost of provisioned resources regardless of whether they are
> in use. That’s fine for platforms that pass the full cost of resources
> consumed back to someone — but it produces poor economics and resource
> waste. Ability to scale to zero dramatically reduces the cost of
> provisioning and maintaining an un/underutilized database.
>
> *– It’s not all or nothing:*
> There are super sensible ways to pair local/block storage and object
> storage. One might be to store upper-level SSTable data components in
> object storage; and all other SSTable components (TOC, CompressionInfo,
> primary index, etc) on local flash. This gives you a way to rapidly
> enumerate and navigate SSTable metadata while only paying the cost of reads
> when fetching data (and possibly from upper-level SSTables only).
> Alternately, one could offload only older partitions of data in TWCS - time
> series data older than 30 days, a year, etc.
>
> *– Mounting a bucket as a filesystem is unusable:*
> Others have made this point. Naively mounting an object storage bucket as
> a filesystem produces uniformly *terrible* results. S3 time to first byte
> is in the 20-30ms+ range. S3 one-zone is closer to the 5-10ms range which
> is on par with the seek latency of a spinning disk. Despite C* originally
> being designed to operate well on spinning disks, a filesystem-like
> abstraction backed by object storage today will result in awful surprises
> due to IO patterns like those found by Jon H. and Jordan recently.
>
> *– Object Storage can unify “database” and “lakehouse”:*
> One could imagine Iceberg integration that enables manifest/snapshot-based
> querying of SSTables in an object store via Spark or similar platforms,
> with zero ETL or light cone contact with a production database process.
>
> The reason people care about object is that it’s 70%+ cheaper than flash -
> and 90%+ cheaper if the software querying it isn’t always running, too.
>
> – Scott
>
> —
> Mobile
>
> On Mar 4, 2025, at 12:29 PM, Štefan Miklošovič 
> wrote:
>
> 
> Jeff,
>
> when it comes to snapshots, there was already discussion in the other
> thread I am not sure you are aware of (1), here (2) I am talking about
> Sidecar + snapshots specifically. One "caveat" of Sidecar is that you
> actually _need_ sidecar if we ever contemplated Sidecar doing upload /
> backup (by whatever means).
>
> "s3 bucket as a mounted directory" bypasses the necessity of having
> Sidecar deployed. I do not want to repeat what I wrote in (2), all the
> reasoning is there.
>
> My primary motivation to do it by mounting is to 1) not use Sidecar if not
> needed 2) just save snapshot SSTables outside of table data dirs (which is
> imho com

Re: Welcome Ekaterina Dimitrova as Cassandra PMC member

2025-03-04 Thread Bernardo Botella

Congratulations!!

On Tue, Mar 4, 2025 at 22:17 Berenguer Blasi 
wrote:

> Congrats Ekaterina!
> On 5/3/25 2:03, Jasonstack Zhao Yang wrote:
>
> Congratulations Ekaterina!
>
> On Wed, 5 Mar 2025 at 08:18, Josh McKenzie  wrote:
>
>> Welcome Ekaterina!  \o/
>>
>> On Tue, Mar 4, 2025, at 7:07 PM, Francisco Guerrero wrote:
>>
>> Congratulations Ekaterina! Well deserved!
>>
>> On 2025/03/04 20:25:08 Paulo Motta wrote:
>> > Aloha,
>> >
>> > The Project Management Committee (PMC) for Apache Cassandra is
>> delighted to
>> > announce that Ekaterina Dimitrova has joined the PMC!
>> >
>> > Thanks a lot, Ekaterina, for everything you have done for the project
>> all
>> > these years.
>> >
>> > The PMC - Project Management Committee - manages and guides the
>> direction
>> > of the project, and is responsible for inviting new committers and PMC
>> > members to steward the longevity of the project.
>> >
>> > See https://community.apache.org/pmc/responsibilities.html if you're
>> > interested in learning more about the rights and responsibilities of PMC
>> > members.
>> >
>> > Please join us in welcoming Ekaterina Dimitrova to her new role in our
>> > project!
>> >
>> > Paulo, on behalf of the Apache Cassandra PMC
>> >
>>
>>
>>

Re: [DISCUSS] CEP-36: A Configurable ChannelProxy to alias external storage locations

2025-03-04 Thread C. Scott Andreas

To Jeff’s point on tactical vs. strategic, here’s the big picture for me on object storage:

– Object storage is 70% cheaper:
Replicated flash block storage is extremely expensive, and more so with compute resources constantly attached. If one were to build a storage platform on top of a cloud provider’s compute and storage infrastructure, selective use of object storage is essential to even being in the ballpark of managed offerings on price. EBS is 8¢/GB. S3 is 2.3¢/GB. It’s over 70% cheaper.

– Local/block storage is priced on storage *provisioned*. Object storage is priced on storage *consumed*:
It’s actually better than 70%. Local/block storage is priced based on the size of disks/volumes provisioned. While they may be resizable, resizing is generally inelastic. This typically produces a large gap between storage consumed vs. storage provisioned - and poor utilization. Object storage is typically priced on storage that is actually consumed.

– Object storage integration is the simplest path to complete decoupling of CPU and storage:
Block volumes are more fungible than local disk, but aren’t even close in flexibility to an SSTable that can be accessed by any function. Object is also the only sensible path to implementing a serverless database whose query facilities can be deployed on a function-as-a-service platform. That enables one to reach an idle compute cost of zero and an idle storage cost of 2.3¢/GB/month (S3).

– Object storage enables scale-to-zero:
Object storage integration is the only path for most databases to provide a scale-to-zero offering that doesn’t rely on keeping hot NVMe or block storage attached 24/7 while a database receives zero queries per second.

– Scale to zero is one of the easiest paths to zero marginal cost (the other is multitenancy - and not mutually exclusive):
Database platforms operated in a cluster-as-a-service model incur a constant fixed cost of provisioned resources regardless of whether they are in use. That’s fine for platforms that pass the full cost of resources consumed back to someone — but it produces poor economics and resource waste. Ability to scale to zero dramatically reduces the cost of provisioning and maintaining an un/underutilized database.

– It’s not all or nothing:
There are super sensible ways to pair local/block storage and object storage. One might be to store upper-level SSTable data components in object storage; and all other SSTable components (TOC, CompressionInfo, primary index, etc) on local flash. This gives you a way to rapidly enumerate and navigate SSTable metadata while only paying the cost of reads when fetching data (and possibly from upper-level SSTables only). Alternately, one could offload only older partitions of data in TWCS - time series data older than 30 days, a year, etc.

– Mounting a bucket as a filesystem is unusable:
Others have made this point. Naively mounting an object storage bucket as a filesystem produces uniformly *terrible* results. S3 time to first byte is in the 20-30ms+ range. S3 one-zone is closer to the 5-10ms range which is on par with the seek latency of a spinning disk. Despite C* originally being designed to operate well on spinning disks, a filesystem-like abstraction backed by object storage today will result in awful surprises due to IO patterns like those found by Jon H. and Jordan recently.

– Object Storage can unify “database” and “lakehouse”:
One could imagine Iceberg integration that enables manifest/snapshot-based querying of SSTables in an object store via Spark or similar platforms, with zero ETL or light cone contact with a production database process.

The reason people care about object is that it’s 70%+ cheaper than flash - and 90%+ cheaper if the software querying it isn’t always running, too.

– Scott

—
Mobile

On Mar 4, 2025, at 12:29 PM, Štefan Miklošovič wrote:

> Jeff,
>
> when it comes to snapshots, there was already discussion in the other thread I am not sure you are aware of (1), here (2) I am talking about Sidecar + snapshots specifically. One "caveat" of Sidecar is that you actually _need_ sidecar if we ever contemplated Sidecar doing upload / backup (by whatever means).
>
> "s3 bucket as a mounted directory" bypasses the necessity of having Sidecar deployed. I do not want to repeat what I wrote in (2), all the reasoning is there.
>
> My primary motivation to do it by mounting is to 1) not use Sidecar if not needed 2) just save snapshot SSTables outside of table data dirs (which is imho completely fine requirement on its own, e.g. putting snapshots on a different / slow disk etc, why does it have to be on the same disk as data are?)
>
> The vibe I got from that thread was that what I was proposing is in general acceptable and I wanted to figure out the details as Blake was mentioning incremental backs as well. Other than that, I was thinking that we are pretty much settled on how that should be, that thread was up for a long time so I was thinking people in general do not have a problem with that.
>
> How I read your email, sp
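
The pricing argument at the top of Scott's message can be checked with a little arithmetic. The sketch below is only an illustration: the 8¢/GB (EBS) and 2.3¢/GB (S3) figures are the ones quoted in the message, while the class name, method names, and the 50% utilization figure are hypothetical and chosen purely to show how the provisioned-vs-consumed gap widens the savings.

```java
// Hypothetical helper, not part of Cassandra: compares per-GB storage cost.
final class ObjectStorageCost {

    // Price per GB actually consumed, given a price per provisioned GB and the
    // fraction of provisioned capacity that holds data (0 < utilization <= 1).
    static double effectiveCentsPerConsumedGb(double centsPerProvisionedGb, double utilization) {
        if (utilization <= 0.0 || utilization > 1.0) {
            throw new IllegalArgumentException("utilization must be in (0, 1]");
        }
        return centsPerProvisionedGb / utilization;
    }

    // Fraction saved by object storage relative to block storage at the given utilization.
    static double savingsFraction(double blockCents, double objectCents, double blockUtilization) {
        return 1.0 - objectCents / effectiveCentsPerConsumedGb(blockCents, blockUtilization);
    }

    public static void main(String[] args) {
        double ebs = 8.0; // cents per GB, as quoted in the thread
        double s3 = 2.3;  // cents per GB, as quoted in the thread
        System.out.printf("fully utilized volume: %.1f%% cheaper%n",
                100 * savingsFraction(ebs, s3, 1.0));
        // 50% utilization is an assumed figure for illustration only.
        System.out.printf("50%% utilized volume (assumed): %.1f%% cheaper%n",
                100 * savingsFraction(ebs, s3, 0.5));
    }
}
```

With a fully utilized volume the savings come out at roughly 71%, matching the "over 70%" claim; at the assumed 50% utilization they rise to roughly 86%, which is the sense in which the message says it is "actually better than 70%".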

Re: [DISCUSS] CEP-36: A Configurable ChannelProxy to alias external storage locations

2025-03-04 Thread Rolo, Carlos via dev

Hello,

I would love to discuss this and provide feedback and design work for this.
Since I'm not an experienced Java programmer, I can't be hands-on with the code,
but I can pick this up and try to carry it forward.

Carlos


From: guo Maxwell 
Sent: 26 February 2025 14:54
To: dev@cassandra.apache.org 
Subject: Re: [DISCUSS] CEP-36: A Configurable ChannelProxy to alias external 
storage locations

Is anyone else interested in continuing to discuss this topic?

On Fri, Sep 20, 2024 at 09:44, guo Maxwell <cclive1...@gmail.com> wrote:
I discussed this offline with Claude, he is no longer working on this.

It's a pity. I think this is a very valuable thing. Commitlog's archiving and 
restore may be able to use the relevant code if it is completed.

On Fri, Sep 20, 2024 at 2:01 AM, Patrick McFadin <pmcfa...@gmail.com> wrote:
Thanks for reviving this one!

On Wed, Sep 18, 2024 at 12:06 AM guo Maxwell <cclive1...@gmail.com> wrote:
Is there any update on this topic?  It seems that things could make big
progress if Jake Luciani can find someone who can make the FileSystemProvider
code accessible.

On Sat, Dec 16, 2023 at 05:29, Jon Haddad <j...@jonhaddad.com> wrote:
At a high level I really like the idea of being able to better leverage cheaper 
storage especially object stores like S3.

One important thing though - I feel pretty strongly that there's a big,
deal-breaking downside.  Backups, disk failure policies, snapshots and possibly
repairs would get more complicated which haven't been particularly great in the 
past, and of course there's the issue of failure recovery being only partially 
possible if you're looking at a durable block store paired with an ephemeral 
one with some of your data not replicated to the cold side.  That introduces a 
failure case that's unacceptable for most teams, which results in needing to 
implement potentially 2 different backup solutions.  This is operationally 
complex with a lot of surface area for headaches.  I think a lot of teams would 
probably have an issue with the big question mark around durability and I 
probably would avoid it myself.

On the other hand, I'm +1 if we approach it slightly differently -
where _all_ the data is located on the cold storage, with the local hot storage 
used as a cache.  This means we can use the cold directories for the complete 
dataset, simplifying backups and node replacements.

For a little background, we had a ticket several years ago where I pointed out 
it was possible to do this *today* at the operating system level as long as 
you're using block devices (vs an object store) and LVM [1].  For example, this 
works well with GP3 EBS w/ low IOPS provisioning + local NVMe to get a nice 
balance of great read performance without going nuts on the cost for IOPS.  I 
also wrote about this in a little more detail in my blog [2].  There's also the 
new mount point tech in AWS which pretty much does exactly what I've suggested 
above [3] that's probably worth evaluating just to get a feel for it.

I'm not insisting we require LVM or the AWS S3 fs, since that would rule out 
other cloud providers, but I am pretty confident that the entire dataset should 
reside in the "cold" side of things for the practical and technical reasons I 
listed above.  I don't think it massively changes the proposal, and should 
simplify things for everyone.

Jon

[1] https://issues.apache.org/jira/browse/CASSANDRA-8460
[2] https://rustyrazorblade.com/post/2018/2018-04-24-intro-to-lvm/
[3] https://aws.amazon.com/about-aws/whats-new/2023/03/mountpoint-amazon-s3/
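
As a rough illustration of the layout Jon describes above (the complete dataset on cold storage, local hot storage used only as a cache), here is a minimal read-through sketch. It is not Cassandra code: `ColdBackedStore` and its methods are hypothetical names, and plain directories stand in for both tiers purely so the example runs. A real object-store tier would use an object API, not a directory.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical sketch: cold tier holds every file, hot tier is a disposable cache.
final class ColdBackedStore {
    private final Path coldDir; // complete dataset (stand-in for durable cold storage)
    private final Path hotDir;  // local cache; losing it must not lose data

    ColdBackedStore(Path coldDir, Path hotDir) {
        this.coldDir = coldDir;
        this.hotDir = hotDir;
    }

    // Write to the cold tier first so it always holds 100% of the data,
    // then populate the cache.
    void write(String name, byte[] data) throws IOException {
        Files.write(coldDir.resolve(name), data);
        Files.write(hotDir.resolve(name), data);
    }

    // Serve from the cache when possible; on a miss, refill it from the cold tier.
    byte[] read(String name) throws IOException {
        Path hot = hotDir.resolve(name);
        if (!Files.exists(hot)) {
            Files.copy(coldDir.resolve(name), hot, StandardCopyOption.REPLACE_EXISTING);
        }
        return Files.readAllBytes(hot);
    }

    public static void main(String[] args) throws IOException {
        Path cold = Files.createTempDirectory("cold");
        Path hot = Files.createTempDirectory("hot");
        ColdBackedStore store = new ColdBackedStore(cold, hot);

        store.write("sstable-1", "hello".getBytes(StandardCharsets.UTF_8));
        Files.delete(hot.resolve("sstable-1")); // simulate losing the local disk
        System.out.println(new String(store.read("sstable-1"), StandardCharsets.UTF_8)); // prints "hello"
    }
}
```

Because every write reaches the cold tier before the cache, a backup or node replacement only has to consider the cold tier, which is the simplification the message argues for.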


On Thu, Dec 14, 2023 at 1:56 AM Claude Warren <cla...@apache.org> wrote:
Is there still interest in this?  Can we get some points down on electrons so 
that we all understand the issues?

While it is fairly simple to redirect the read/write to something other  than 
the local system for a single node this will not solve the problem for tiered 
storage.

Tiered storage will require that on read/write the primary key be assessed to
determine whether the read/write should be redirected.  My reasoning for this
statement is that in a cluster with a replication factor greater than 1 the 
node will store data for the keys that would be allocated to it in a cluster 
with a replication factor = 1, as well as some keys from nodes earlier in

Re: Welcome Aaron Ploetz as Cassandra Committer

Congratulations Aaron, happy to see you recognized as a committer!

Cheers,

Paulo

On Tue, 4 Mar 2025 at 03:26 Bernardo Botella 
wrote:

> That’s awesome!!
>
> Congratulations Aaron!! Long overdue for sure!
>
>
> On Mon, Mar 3, 2025 at 16:25 Patrick McFadin  wrote:
>
>> The Apache Cassandra PMC is very happy to announce that Aaron Ploetz has
>> accepted the invitation to become a committer!
>>
>> Aaron has been tireless in his mission to help every single Cassandra
>> operator on planet Earth. If you don't believe me, check out his Stack
>> Overflow profile page: https://stackoverflow.com/users/1054558/aaron
>> He's been a continuous speaker on Cassandra topics and is one of the
>> coordinators for the Planet Cassandra meetup. Those are just the
>> recent highlights.
>>
>> Please join us in congratulating and welcoming Aaron.
>>
>> The Apache Cassandra PMC members
>>
>

Re: Welcome Bernardo Botella as Cassandra Committer

Congrats Bernardo!!

On Tue, 4 Mar 2025 at 10:58, Berenguer Blasi 
wrote:

> Congrats!
>
> On 4/3/25 8:30, Štefan Miklošovič wrote:
> > The Project Management Committee (PMC) for Apache Cassandra has
> > invited Bernardo Botella to become a committer and we are pleased to
> > announce that he has accepted.
> >
> > Please join us in welcoming Bernardo Botella to his new role and
> > responsibility in our project community.
> >
> > Stefan Miklosovic
> >
> > On behalf of the Apache Cassandra PMC
>


-- 
Dmitry Konstantinov

Re: Welcome Aaron Ploetz as Cassandra Committer

2025-03-04 Thread Josh McKenzie

Congrats Aaron!

On Tue, Mar 4, 2025, at 4:08 AM, Soheil Rahsaz wrote:
> Congratulations Aaron!
> 
> On Tue, Mar 4, 2025 at 12:09 PM Paulo Motta  wrote:
>> Congratulations Aaron, happy to see you recognized as a committer!
>> 
>> Cheers,
>> 
>> Paulo
>> 
>> On Tue, 4 Mar 2025 at 03:26 Bernardo Botella  
>> wrote:
>>> That’s awesome!!
>>> 
>>> Congratulations Aaron!! Long overdue for sure!
>>> 
>>> 
>>> On Mon, Mar 3, 2025 at 16:25 Patrick McFadin  wrote:
>>>> The Apache Cassandra PMC is very happy to announce that Aaron Ploetz has
>>>> accepted the invitation to become a committer!
>>>>
>>>> Aaron has been tireless in his mission to help every single Cassandra
>>>> operator on planet Earth. If you don't believe me, check out his Stack
>>>> Overflow profile page: https://stackoverflow.com/users/1054558/aaron
>>>> He's been a continuous speaker on Cassandra topics and is one of the
>>>> coordinators for the Planet Cassandra meetup. Those are just the
>>>> recent highlights.
>>>>
>>>> Please join us in congratulating and welcoming Aaron.
>>>>
>>>> The Apache Cassandra PMC members

Re: Welcome Bernardo Botella as Cassandra Committer

2025-03-04 Thread Josh McKenzie

Congrats Bernardo - it's been great collaborating with you thus far and looking 
forward to more!

On Tue, Mar 4, 2025, at 4:13 AM, Paulo Motta wrote:
> ¡Felicitaciones Bernardo! Well deserved!
> 
> Cheers,
> 
> Paulo
> 
> On Tue, 4 Mar 2025 at 06:10 Soheil Rahsaz  wrote:
>> Congrats, Bernardo! Well deserved! 🎉 Looking forward to seeing your 
>> continued impact on the project.
>> 
>> 
>> 
>> 
>> On Tue, Mar 4, 2025 at 12:03 PM Dmitry Konstantinov  
>> wrote:
>>> Congrats Bernardo!!
>>> 
>>> On Tue, 4 Mar 2025 at 10:58, Berenguer Blasi  
>>> wrote:
>>>> Congrats!
>>>>
>>>> On 4/3/25 8:30, Štefan Miklošovič wrote:
>>>> > The Project Management Committee (PMC) for Apache Cassandra has
>>>> > invited Bernardo Botella to become a committer and we are pleased to
>>>> > announce that he has accepted.
>>>> >
>>>> > Please join us in welcoming Bernardo Botella to his new role and
>>>> > responsibility in our project community.
>>>> >
>>>> > Stefan Miklosovic
>>>> >
>>>> > On behalf of the Apache Cassandra PMC
>>> 
>>> 
>>> --
>>> Dmitry Konstantinov

Re: Welcome Aaron Ploetz as Cassandra Committer

2025-03-04 Thread Soheil Rahsaz

Congratulations Aaron!

On Tue, Mar 4, 2025 at 12:09 PM Paulo Motta  wrote:

> Congratulations Aaron, happy to see you recognized as a committer!
>
> Cheers,
>
> Paulo
>
> On Tue, 4 Mar 2025 at 03:26 Bernardo Botella 
> wrote:
>
>> That’s awesome!!
>>
>> Congratulations Aaron!! Long overdue for sure!
>>
>>
>> On Mon, Mar 3, 2025 at 16:25 Patrick McFadin  wrote:
>>
>>> The Apache Cassandra PMC is very happy to announce that Aaron Ploetz has
>>> accepted the invitation to become a committer!
>>>
>>> Aaron has been tireless in his mission to help every single Cassandra
>>> operator on planet Earth. If you don't believe me, check out his Stack
>>> Overflow profile page: https://stackoverflow.com/users/1054558/aaron
>>> He's been a continuous speaker on Cassandra topics and is one of the
>>> coordinators for the Planet Cassandra meetup. Those are just the
>>> recent highlights.
>>>
>>> Please join us in congratulating and welcoming Aaron.
>>>
>>> The Apache Cassandra PMC members
>>>
>>

Re: Welcome Bernardo Botella as Cassandra Committer

2025-03-04 Thread Soheil Rahsaz

Congrats, Bernardo! Well deserved! 🎉 Looking forward to seeing your
continued impact on the project.


On Tue, Mar 4, 2025 at 12:03 PM Dmitry Konstantinov 
wrote:

> Congrats Bernardo!!
>
> On Tue, 4 Mar 2025 at 10:58, Berenguer Blasi 
> wrote:
>
>> Congrats!
>>
>> On 4/3/25 8:30, Štefan Miklošovič wrote:
>> > The Project Management Committee (PMC) for Apache Cassandra has
>> > invited Bernardo Botella to become a committer and we are pleased to
>> > announce that he has accepted.
>> >
>> > Please join us in welcoming Bernardo Botella to his new role and
>> > responsibility in our project community.
>> >
>> > Stefan Miklosovic
>> >
>> > On behalf of the Apache Cassandra PMC
>>
>
>
> --
> Dmitry Konstantinov
>

Re: Welcome Bernardo Botella as Cassandra Committer

¡Felicitaciones Bernardo! Well deserved!

Cheers,

Paulo

On Tue, 4 Mar 2025 at 06:10 Soheil Rahsaz  wrote:

> Congrats, Bernardo! Well deserved! 🎉 Looking forward to seeing your
> continued impact on the project.
>
>
> On Tue, Mar 4, 2025 at 12:03 PM Dmitry Konstantinov 
> wrote:
>
>> Congrats Bernardo!!
>>
>> On Tue, 4 Mar 2025 at 10:58, Berenguer Blasi 
>> wrote:
>>
>>> Congrats!
>>>
>>> On 4/3/25 8:30, Štefan Miklošovič wrote:
>>> > The Project Management Committee (PMC) for Apache Cassandra has
>>> > invited Bernardo Botella to become a committer and we are pleased to
>>> > announce that he has accepted.
>>> >
>>> > Please join us in welcoming Bernardo Botella to his new role and
>>> > responsibility in our project community.
>>> >
>>> > Stefan Miklosovic
>>> >
>>> > On behalf of the Apache Cassandra PMC
>>>
>>
>>
>> --
>> Dmitry Konstantinov
>>
>

Re: [DISCUSS] CEP-36: A Configurable ChannelProxy to alias external storage locations

I do not think we need this CEP, honestly. I don't want to diss this
unnecessarily but if you mount a remote storage locally (e.g. mounting s3
bucket as if it was any other directory on node's machine), then what is
this CEP good for?

Not talking about the necessity to put all dependencies to be able to talk
to respective remote storage to Cassandra's class path, introducing
potential problems with dependencies and their possible incompatibilities /
different versions etc ...

On Thu, Feb 27, 2025 at 6:21 AM C. Scott Andreas 
wrote:

> I’d love to see this implemented — where “this” is a proxy for some
> notion of support for remote object storage, perhaps usable by compaction
> strategies like TWCS to migrate data older than a threshold from a local
> filesystem to remote object.
>
> It’s not an area where I can currently dedicate engineering effort. But if
> others are interested in contributing a feature like this, I’d see it as
> valuable for the project and would be happy to collaborate on
> design/architecture/goals.
>
> – Scott
>
> On Feb 26, 2025, at 6:56 AM, guo Maxwell  wrote:
>
> 
> Is anyone else interested in continuing to discuss this topic?
>
> On Fri, Sep 20, 2024 at 09:44, guo Maxwell wrote:
>
>> I discussed this offline with Claude, he is no longer working on this.
>>
>> It's a pity. I think this is a very valuable thing. Commitlog's archiving
>> and restore may be able to use the relevant code if it is completed.
>>
>> On Fri, Sep 20, 2024 at 2:01 AM, Patrick McFadin wrote:
>>
>>> Thanks for reviving this one!
>>>
>>> On Wed, Sep 18, 2024 at 12:06 AM guo Maxwell 
>>> wrote:
>>>
>>>> Is there any update on this topic? It seems that things could make big
>>>> progress if Jake Luciani can find someone who can make the
>>>> FileSystemProvider code accessible.
>>>>
>>>> On Sat, Dec 16, 2023 at 05:29, Jon Haddad wrote:

> At a high level I really like the idea of being able to better
> leverage cheaper storage especially object stores like S3.
>
> One important thing though - I feel pretty strongly that there's a
> big, deal breaking downside.   Backups, disk failure policies, snapshots
> and possibly repairs would get more complicated which haven't been
> particularly great in the past, and of course there's the issue of failure
> recovery being only partially possible if you're looking at a durable block
> store paired with an ephemeral one with some of your data not replicated to
> the cold side.  That introduces a failure case that's unacceptable for most
> teams, which results in needing to implement potentially 2 different backup
> solutions.  This is operationally complex with a lot of surface area for
> headaches.  I think a lot of teams would probably have an issue with the
> big question mark around durability and I probably would avoid it myself.
>
> On the other hand, I'm +1 if we approach it slightly
> differently - where _all_ the data is located on the cold storage, with the
> local hot storage used as a cache.  This means we can use the cold
> directories for the complete dataset, simplifying backups and node
> replacements.
>
> For a little background, we had a ticket several years ago where I
> pointed out it was possible to do this *today* at the operating system
> level as long as you're using block devices (vs an object store) and LVM
> [1].  For example, this works well with GP3 EBS w/ low IOPS provisioning +
> local NVMe to get a nice balance of great read performance without going
> nuts on the cost for IOPS.  I also wrote about this in a little more detail
> in my blog [2].  There's also the new mount point tech in AWS which pretty
> much does exactly what I've suggested above [3] that's probably worth
> evaluating just to get a feel for it.
>
> I'm not insisting we require LVM or the AWS S3 fs, since that would
> rule out other cloud providers, but I am pretty confident that the entire
> dataset should reside in the "cold" side of things for the practical and
> technical reasons I listed above.  I don't think it massively changes the
> proposal, and should simplify things for everyone.
>
> Jon
>
> [1] https://issues.apache.org/jira/browse/CASSANDRA-8460
> [2] https://rustyrazorblade.com/post/2018/2018-04-24-intro-to-lvm/
> [3]
> https://aws.amazon.com/about-aws/whats-new/2023/03/mountpoint-amazon-s3/
>
>
> On Thu, Dec 14, 2023 at 1:56 AM Claude Warren 
> wrote:
>
>> Is there still interest in this?  Can we get some points down on
>> electrons so that we all understand the issues?
>>
>> While it is fairly simple to redirect the read/write to something
>> other  than the local system for a single node this will not solve the
>> problem for tiered storage.
>>
>> Tiered storage will require that on read/write the primary key be
>

Community Over Code Asia Travel Assistance Applications now open!

Hi,

The Travel Assistance Committee (TAC) are pleased to announce that
travel assistance applications for Community over Code Asia 2025 are now
open!

We will be supporting Community over Code Asia,  Beijing, China
July 25th to the 27th 2025.

TAC exists to help those that would like to attend Community over Code
events, but are unable to do so for financial reasons. For more info
on this year's applications and qualifying criteria, please visit the
TAC website at < https://tac.apache.org/ >.
Applications are already open on https://tac-apply.apache.org/, so don't
delay!

The Apache Travel Assistance Committee will only be accepting
applications from those people that are able to attend the full event.

Important: Applications close on Friday 9th May, 2025.

Applicants have until the closing date above to submit their
applications (which should contain as much supporting material as
required to efficiently and accurately process their request), this
will enable TAC to announce successful applications shortly
afterwards.

As usual, TAC expects to deal with a range of applications from a
diverse range of backgrounds; therefore, we encourage (as always)
anyone thinking about sending in an application to do so ASAP.

For those that will need a Visa to enter the Country - we advise you apply
now so that you have enough time in case of interview delays. So do not
wait until you know if you have been accepted or not.

We look forward to greeting many of you in Beijing China, July 2025!

Kind Regards,

Paulo

(On behalf of the Travel Assistance Committee)

PHP Client Driver

2025-03-04 Thread Michael Roosz

Hello Cassandra developers,

since the official Cassandra php client driver list currently only contains
outdated and unmaintained projects, could you please consider adding my php
client driver to this list?
https://cassandra.apache.org/doc/latest/cassandra/getting-started/drivers.html#php
https://github.com/MichaelRoosz/php-cassandra

Best Regards,
Michael

Re: [DISCUSS] synchronisation of properties between Config.java and cassandra.yaml

>>
https://docs.google.com/spreadsheets/d/11MOxhNqwE1tWP4ex2gzKG2pmeAWFaHDKo-CRp25h9BU/edit?gid=0#gid=0
We still have a lot of rows empty. I have added many default values and a
Cassandra version when a parameter was introduced (to differentiate some
recent parameters from old ones) based on source code but it would be nice
to get a description for parameters from the authors as well as
classification exposed/hidden.
Maybe we should not wait until info about all parameters is collected, and
instead update what we have + use a threshold in the Ant validation task to
fail when new missing parameters are added. The logic in the dev branch here
https://issues.apache.org/jira/browse/CASSANDRA-20249 already supports a
threshold.
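
To illustrate the threshold idea above, here is a minimal sketch in Java. It
is not the Ant task from CASSANDRA-20249: DemoConfig, missingFromYaml and
check are hypothetical names, and the YAML handling is a naive regex over
top-level keys (active or commented out) rather than a real parser.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ConfigYamlCheck {
    // Hypothetical stand-in for Config.java; the fields are for illustration only.
    static class DemoConfig {
        public String cluster_name = "Test Cluster";
        public int num_tokens = 16;
        public boolean some_currently_internal_property = false;
        public static final int IGNORED_CONSTANT = 1; // static fields are skipped
    }

    // Naive assumption: a top-level key looks like "key:" or "# key:" at line start.
    // A prose comment such as "# note: ..." would be miscounted as a key.
    private static final Pattern KEY =
        Pattern.compile("^#?[ ]?([a-z][a-z0-9_]*):", Pattern.MULTILINE);

    static Set<String> yamlKeys(String yaml) {
        Set<String> keys = new TreeSet<>();
        Matcher m = KEY.matcher(yaml);
        while (m.find()) keys.add(m.group(1));
        return keys;
    }

    // Public, non-static fields of the config class that the yaml never mentions.
    static Set<String> missingFromYaml(Class<?> configClass, String yaml) {
        Set<String> present = yamlKeys(yaml);
        Set<String> missing = new TreeSet<>();
        for (Field f : configClass.getDeclaredFields()) {
            int mod = f.getModifiers();
            if (Modifier.isStatic(mod) || !Modifier.isPublic(mod) || f.isSynthetic()) continue;
            if (!present.contains(f.getName())) missing.add(f.getName());
        }
        return missing;
    }

    // Fails only when the number of missing properties exceeds the allowed threshold,
    // so the existing backlog is tolerated but new omissions are caught.
    static void check(Class<?> configClass, String yaml, int threshold) {
        Set<String> missing = missingFromYaml(configClass, yaml);
        if (missing.size() > threshold)
            throw new IllegalStateException(
                missing.size() + " properties missing from yaml (allowed " + threshold + "): " + missing);
    }

    public static void main(String[] args) {
        String yaml = "cluster_name: 'Test Cluster'\n# num_tokens: 16\n";
        System.out.println(missingFromYaml(DemoConfig.class, yaml)); // [some_currently_internal_property]
        check(DemoConfig.class, yaml, 1); // passes: one missing property, threshold is 1
    }
}
```

Lowering the threshold as rows of the spreadsheet get filled in would ratchet
the check towards zero without blocking on the full inventory.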


On Tue, 28 Jan 2025 at 00:04, Josh McKenzie  wrote:

> Good point re: the implications of parsing and durability in the face of
> seeing unknown or missing parameters. I don't think widening the scope on
> that would be ideal, especially considering the entire impetus for this
> conversation is "we've misbehaved with our config and have a bunch of
> undocumented stuff we're not sure is still useful, or what it's for". =/
>
> On Mon, Jan 27, 2025, at 3:41 PM, Štefan Miklošovič wrote:
>
> "we take "unclaimed" items and move them to their own InternalConfig.java
> or something"
>
> This is interesting. If we are meant to be still able to put these
> properties into cassandra.yaml (even if they are "internal ones") and they
> would be just in InternalConfig.java for some basic separation of internal
> / user-facing configuration, then we would need to have two yaml loaders:
>
> 1) the one as we have now which loads cassandra.yaml into Config.java
> 2) the second one which would load cassandra.yaml into InternalConfig.java
>
> For both cases, we could not fail when there are unrecognized properties
> in cassandra.yaml while parsing it (1), because every loader, for
> Config.java as well as InternalConfig.java, is parsing just some "subset"
> of yaml.
>
> (1)
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/config/YamlConfigurationLoader.java#L443-L444
>
> If we just had "public InternalConfig internal = new InternalConfig" as a
> field in Config.java, then this would lead to properties being effectively
> renamed in cassandra.yaml like
>
> internal:
>   some_currently_internal_property: false
>
> instead of just
>
> some_currently_internal_property: false
>
> I do not think we want to have them renamed / under different
> configuration sections in yaml. I get that they are "internal" etc but we
> just don't know how / where it is used and deployed and just blindly
> renaming them is not a good idea imho.
>
> On Mon, Jan 27, 2025 at 8:46 PM Josh McKenzie 
> wrote:
>
>
> This may be an off-base comparison, but this reminds me of struggles we've
> had getting to 0 failing unit tests before and the debates on fencing off a
> snapshot of the current "failure set" so you can have a set point where no
> further degradation is allowed in a primary data set.
>
> All of which is to say - maybe at the end of the spreadsheet, we take
> "unclaimed" items and move them to their own InternalConfig.java or
> something and add an ant target that a) disallows further addition to
> InternalConfig.java w/out throwing an error / needing whitelist update, and
> b) disallows further regression in the Config.java <-> cassandra.yaml
> relationship for non-annotated fields.
>
> That way we can at least halt the progression of the disease even if we're
> stymied on cleaning up some of the existing symptoms.
>
> On Mon, Jan 27, 2025, at 1:38 PM, Štefan Miklošovič wrote:
>
> Indeed, we need to balance that and thoughtfully choose what is going to
> be added and what not. However, we should not hide something which is meant
> to be tweaked by a user. The config is intimidating mostly because
> everything is just in one file. I vaguely remember discussions a few years
> ago which were about splitting cassandra.yaml into multiple files which
> would be focused just on one subsystem / would cover some logically
> isolated domain.
>
> Anyway, I think the main goal of this effort for now would be to at least
> map where we are at. Some of them are genuinely missing. E.g. guardrails,
> how is a user meant to know about that if it is not even documented ...
>
> On Mon, Jan 27, 2025 at 6:16 PM Chris lohfink  wrote:
>
> Might be a bit of a balance between exposing what people actually are
> likely to need to modify vs having a super intimidating config file. It's
> already nearly 2000 lines. Personally I'd rather see some
> auto-documentation or something that's in the docs
> than an effort to manually add another 1000 lines.
>
> Chris
>
> On Fri, Jan 24, 2025 at 9:41 AM Dmitry Konstantinov 
> wrote:
>
> Maybe I missed some patterns but it looks like a pretty good estimation, I
> did like 10 random checks manually to

Re: Welcome Aaron Ploetz as Cassandra Committer

2025-03-04 Thread Francisco Guerrero

Congratulations Aaron!

On 2025/03/04 00:23:49 Patrick McFadin wrote:
> The Apache Cassandra PMC is very happy to announce that Aaron Ploetz has
> accepted the invitation to become a committer!
> 
> Aaron has been tireless in his mission to help every single Cassandra
> operator on planet Earth. If you don't believe me, check out his Stack
> Overflow profile page: https://stackoverflow.com/users/1054558/aaron
> He's been a continuous speaker on Cassandra topics and is one of the
> coordinators for the Planet Cassandra meetup. Those are just the
> recent highlights.
> 
> Please join us in congratulating and welcoming Aaron.
> 
> The Apache Cassandra PMC members
>

Re: CEP-15 Update

2025-03-04 Thread Blake Eggleston

Thanks Benedict!

I’m really excited to see accord reach this milestone, even with these caveats. 
You seem to have left yourself off the list of contributors though, even though 
you’ve been a central figure in its development :) So thanks to all accord & 
tcm contributors, including Benedict, for making this possible!

On Tue, Mar 4, 2025, at 8:00 AM, Benedict Elliott Smith wrote:
> Hi everyone,
> 
> It’s been exactly 3.5 years since the first commit to cassandra-accord. Yes, 
> really, it’s been that long.
> 
> We will be starting to validate the feature against real workloads in the 
> near future, so we can’t sensibly push off merging much longer. The following 
> is a brief run-down of the state of play. There are no known bugs, but there 
> remain a number of caveats we will be incrementally addressing in the run-up 
> to a full release:
> 
> [1] Accord is likely to be SLOW until further optimisations are implemented
> [2] Schema changes have a number of hard edges
> [3] Validation is ongoing, so there are likely still a number of bugs to 
> shake out
> [4] Many operator visibility/tooling/documentation improvements are pending
> 
> To expand a little: 
> 
> [1] As of the last experiment we conducted, accord’s throughput was poor - 
> also leading to higher LAN latencies. We have done no WAN experiments to 
> date, but the protocol guarantees should already achieve better round-trip 
> performance, in particular under contention. Improving throughput will be the 
> main focus of attention once we are satisfied the protocol is otherwise 
> stable, but our focus remains validation for the moment.
> [2] Schema changes have not yet been well integrated with TCM. Dropping a 
> table for instance will currently cause problems if nodes are offline.
> [3] We have a range of validations we are already performing against 
> cassandra-accord directly, and against its integration with Cassandra in 
> cep-15-accord. We have run hundreds of billions of simulated transactions, 
> and are still discovering some minor fault every few billion simulated 
> transactions or so. There remains a lot more simulated validation to explore, 
> as well as with real clusters serving real workloads.
> [4] There are already a range of virtual tables for exploring internal state 
> in Accord, and reasonably good metric support. However, tracing is not yet 
> supported, and our metric and virtual table integrations need some further 
> development.
> [5] There are also other edge cases to address such as ensuring we do not 
> reuse HLCs after restart, supporting ByteOrderPartitioner, and live migration 
> from/to Paxos is undergoing fine-tuning and validation; probably there are 
> some other things I am forgetting.
> 
> Altogether the feature is fairly mature, despite these caveats. This is the 
> fruit of the labour of a long list of contributors, including Aleksey 
> Yeschenko, Alex Petrov, Ariel Weisberg, Blake Eggleston, Caleb Rackliffe and 
> David Capwell, and represents a huge undertaking. It also wouldn’t have been 
> possible without the work of Alex Petrov, Marcus Eriksson and Sam Tunnicliffe 
> on delivering transactional cluster metadata. I hope you will join me in 
> thanking them all for their contributions.
> 
> Alex has also kindly produced some initial overview documentation for 
> developers, that can be found here: 
> https://github.com/apache/cassandra/blob/cep-15-accord/doc/modules/cassandra/pages/developing/accord/index.adoc.
>  This will be expanded as time permits.
> 
> Does anyone have any questions or concerns?

Dropwizard/Codahale metrics deprecation in Cassandra server

Hi all,

After a long conversation with Benedict and Maxim in CASSANDRA-20250,
I would like to
raise and discuss a proposal to deprecate Dropwizard/Codahale metrics usage
in the next major release of Cassandra server and drop it in the following
major release.
Our own Java API and implementation should be introduced in its place. For
the next major release, the Dropwizard/Codahale API would still be supported
by extending the Codahale implementations, to give potential users of this API
enough time to transition.
The proposal does not affect JMX API for metrics, it is only about local
Java API changes within Cassandra server classpath, so it is about the
cases when somebody outside of Cassandra server code relies on Codahale API
in some kind of extensions or agents.

Reasons:
1) Codahale metrics implementation is not very efficient from a CPU and
memory usage point of view. In the past we already replaced the default
Codahale implementations for Reservoir with our custom one and now in
CASSANDRA-20250 we
(Benedict and I) want to add a more efficient implementation for Counter
and Meter logic. So, in total we do not have much logic left from the
original library (mostly a MetricRegistry as a container for metrics) and the
majority of the logic is implemented by ourselves.
We use metrics a lot along the read and write paths and they contribute a
visible overhead (for example, for a plain write load it is about 9-11%
according to an async-profiler CPU profile), so we want them to be highly
optimized.
From a memory perspective, Counter and Meter are built on LongAdder and
they are quite heavy for the amounts which we create and use.

2) Codahale metrics does not provide any way to replace Counter and Meter
implementations. There are no fully functional interfaces for these
entities, and MetricRegistry has casts/checks to implementations and cannot
work with anything else.
I looked through the already reported issues and found the following
similar and unsuccessful attempt to introduce interfaces for metrics:
https://github.com/dropwizard/metrics/issues/2186
as well as other older attempts:
https://github.com/dropwizard/metrics/issues/252
https://github.com/dropwizard/metrics/issues/264
https://github.com/dropwizard/metrics/issues/703
https://github.com/dropwizard/metrics/pull/487
https://github.com/dropwizard/metrics/issues/479
https://github.com/dropwizard/metrics/issues/253

So, the option of requesting extensibility from Codahale metrics does not
look realistic.

3) It looks like the library is in maintenance mode now: the 5.x version is on
hold and many integrations are no longer active.
The main benefit of using Codahale metrics should be the huge number of
reporters/integrations, but if we carefully check the list of reporters
mentioned here:
https://metrics.dropwizard.io/4.2.0/manual/third-party.html#reporters
we can see that almost all of them are dead/archived.

4) In general, exposing other 3rd party libraries as our own public API
frequently creates too many limitations and issues (Guava is another
typical example which I saw previously, it is easy to start but later you
struggle more and more).
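
To make points 1) and 2) more concrete, here is a rough sketch in Java of what
an own, swappable counter API could look like. Everything in it is an
assumption made for illustration: the Counter interface and both classes are
hypothetical, and FieldCounter is not the implementation proposed in
CASSANDRA-20250. It only shows that once callers depend on an interface, a
LongAdder-backed counter can be replaced by one with a smaller footprint.

```java
import java.util.concurrent.atomic.AtomicLongFieldUpdater;
import java.util.concurrent.atomic.LongAdder;

public class CounterSketch {
    /** Hypothetical interface; not an existing Cassandra or Codahale type. */
    interface Counter {
        void inc(long n);
        long count();
    }

    /** LongAdder-backed: scales under contention, but allocates striped cells. */
    static final class LongAdderCounter implements Counter {
        private final LongAdder adder = new LongAdder();
        public void inc(long n) { adder.add(n); }
        public long count() { return adder.sum(); }
    }

    /** Single volatile long: small footprint per counter, more CAS contention. */
    static final class FieldCounter implements Counter {
        private static final AtomicLongFieldUpdater<FieldCounter> UPDATER =
            AtomicLongFieldUpdater.newUpdater(FieldCounter.class, "value");
        private volatile long value;
        public void inc(long n) { UPDATER.addAndGet(this, n); }
        public long count() { return value; }
    }

    /** Increments the counter from several threads and returns the final count. */
    static long hammer(Counter c, int threads, int perThread) {
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) c.inc(1);
            });
            workers[i].start();
        }
        try {
            for (Thread t : workers) t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return c.count();
    }

    public static void main(String[] args) {
        // Both implementations are used through the same interface.
        System.out.println(hammer(new LongAdderCounter(), 4, 10_000)); // 40000
        System.out.println(hammer(new FieldCounter(), 4, 10_000));     // 40000
    }
}
```

Which trade-off is right for hot read/write-path metrics would need to be
measured; the sketch makes no claim about relative performance.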

Does anyone have any questions or concerns regarding this suggestion?
-- 
Dmitry Konstantinov

Re: Welcome Ekaterina Dimitrova as Cassandra PMC member

Congrats Ekaterina!!

On Tue, 4 Mar 2025 at 23:25, Paulo Motta  wrote:

> Aloha,
>
> The Project Management Committee (PMC) for Apache Cassandra is delighted
> to announce that Ekaterina Dimitrova has joined the PMC!
>
> Thanks a lot, Ekaterina, for everything you have done for the project all
> these years.
>
> The PMC - Project Management Committee - manages and guides the direction
> of the project, and is responsible for inviting new committers and PMC
> members to steward the longevity of the project.
>
> See https://community.apache.org/pmc/responsibilities.html if you're
> interested in learning more about the rights and responsibilities of PMC
> members.
>
> Please join us in welcoming Ekaterina Dimitrova to her new role in our
> project!
>
> Paulo, on behalf of the Apache Cassandra PMC
>


-- 
Dmitry Konstantinov

Re: Welcome Ekaterina Dimitrova as Cassandra PMC member

2025-03-04 Thread Rahul Singh (ANANT)

 Congrats Ekaterina!


On Tue, Mar 4, 2025 at 3:57 PM, Aaron  wrote:

> Welcome Ekaterina! Congratulations!!!
>
> On Tue, Mar 4, 2025 at 2:50 PM Yifan Cai  wrote:
>
>> Congratulations!
>> --
>> *From:* Dmitry Konstantinov 
>> *Sent:* Tuesday, March 4, 2025 12:40:48 PM
>> *To:* dev@cassandra.apache.org 
>> *Subject:* Re: Welcome Ekaterina Dimitrova as Cassandra PMC member
>>
>> Congrats Ekaterina!!
>>
>> On Tue, 4 Mar 2025 at 23:25, Paulo Motta  wrote:
>>
>> Aloha,
>>
>> The Project Management Committee (PMC) for Apache Cassandra is delighted
>> to announce that Ekaterina Dimitrova has joined the PMC!
>>
>> Thanks a lot, Ekaterina, for everything you have done for the project all
>> these years.
>>
>> The PMC - Project Management Committee - manages and guides the direction
>> of the project, and is responsible for inviting new committers and PMC
>> members to steward the longevity of the project.
>>
>> See https://community.apache.org/pmc/responsibilities.html if you're
>> interested in learning more about the rights and responsibilities of PMC
>> members.
>>
>> Please join us in welcoming Ekaterina Dimitrova to her new role in our
>> project!
>>
>> Paulo, on behalf of the Apache Cassandra PMC
>>
>>
>>
>> --
>> Dmitry Konstantinov
>>
>

Re: Dropwizard/Codahale metrics deprecation in Cassandra server

I've got a few thoughts...

On the performance side, I took a look at a few CPU profiles from past
benchmarks and I'm seeing DropWizard taking ~ 3% of CPU time.  Is there a
specific workload you're running where you're seeing it take up a
significant % of CPU time?  Could you share some metrics, profile data, or
a workload so I can try to reproduce your findings?  In my testing I've
found the majority of the overhead from metrics to come from JMX, not
DropWizard.

On the operator side, inventing our own metrics lib risks making it
harder to instrument Cassandra.  There are libraries out there that allow
you to tap into DropWizard metrics directly.  For example, Sarma Pydipally
did a presentation on this last year [1] based on some code I threw
together.

If you're planning on making it easier to instrument C* by supporting
sending metrics to the OTel collector [2], then I could see the change
being a net win as long as the perf is no worse than the status quo.

It's hard to know the full extent of what you're planning and the impact,
so I'll save any opinions till I know more about the plan.

Thanks for bringing this up!
Jon

[1]
https://planetcassandra.org/leaf/apache-cassandra-lunch-62-grafana-dashboard-for-apache-cassandra-business-platform-team/
[2] https://opentelemetry.io/docs/collector/

On Tue, Mar 4, 2025 at 12:40 PM Dmitry Konstantinov 
wrote:

> Hi all,
>
> After a long conversation with Benedict and Maxim in CASSANDRA-20250
>  I would like to
> raise and discuss a proposal to deprecate Dropwizard/Codahale metrics usage
> in the next major release of Cassandra server and drop it in the following
> major release.
> Instead of it our own Java API and implementation should be introduced.
> For the next major release Dropwizard/Codahale API is still planned to
> support by extending Codahale implementations, to give potential users of
> this API enough time for transition.
> The proposal does not affect JMX API for metrics, it is only about local
> Java API changes within Cassandra server classpath, so it is about the
> cases when somebody outside of Cassandra server code relies on Codahale API
> in some kind of extensions or agents.
>
> Reasons:
> 1) Codahale metrics implementation is not very efficient from CPU and
> memory usage point of view. In the past we already replaced default
> Codahale implementations for Reservoir with our custom one and now in
> CASSANDRA-20250  we
> (Benedict and I) want to add a more efficient implementation for Counter
> and Meter logic. So, in total we do not have so much logic left from the
> original library (mostly a MetricRegistry as container for metrics) and the
> majority of logic is implemented by ourselves.
> We use metrics a lot along the read and write paths and they contribute a
> visible overhead (for example for plain write load it is about 9-11%
> according to async profiler CPU profile), so we want them to be highly
> optimized.
> From memory perspective Counter and Meter are built based on LongAdder and
> they are quite heavy for the amounts which we create and use.
>
> 2) Codahale metrics does not provide any way to replace Counter and Meter
> implementations. There are no full functional interfaces for these
> entities + MetricRegistry has casts/checks to implementations and cannot
> work with anything else.
> I looked through the already reported issues and found the following
> similar and unsuccessful attempt to introduce interfaces for metrics:
> https://github.com/dropwizard/metrics/issues/2186
> as well as other older attempts:
> https://github.com/dropwizard/metrics/issues/252
> https://github.com/dropwizard/metrics/issues/264
> https://github.com/dropwizard/metrics/issues/703
> https://github.com/dropwizard/metrics/pull/487
> https://github.com/dropwizard/metrics/issues/479
> https://github.com/dropwizard/metrics/issues/253
>
> So, the option to request an extensibility from Codahale metrics does not
> look real..
>
> 3) It looks like the library is in maintenance mode now, 5.x version is on
> hold and many integrations are also not so alive.
> The main benefit to use Codahale metrics should be a huge amount of
> reporters/integrations but if we check carefully the list of reporters
> mentioned here:
> https://metrics.dropwizard.io/4.2.0/manual/third-party.html#reporters
> we can see that almost all of them are dead/archived.
>
> 4) In general, exposing other 3rd party libraries as our own public API
> frequently creates too many limitations and issues (Guava is another
> typical example which I saw previously, it is easy to start but later you
> struggle more and more).
>
> Does anyone have any questions or concerns regarding this suggestion?
> --
> Dmitry Konstantinov
>

Re: Welcome Aaron Ploetz as Cassandra Committer

2025-03-04 Thread Rahul Singh (ANANT)

Wooot woot Congrats Aaron


On Tue, Mar 04, 2025 at 12:24 PM, Francisco Guerrero 
wrote:

> Congratulations Aaron!
>
> On 2025/03/04 00:23:49 Patrick McFadin wrote:
>
> The Apache Cassandra PMC is very happy to announce that Aaron Ploetz has
> accepted the invitation to become a committer!
>
> Aaron has been tireless in his mission to help every single Cassandra
> operator on planet Earth. If you don't believe me, check out his Stack
> Overflow profile page: https://stackoverflow.com/users/1054558/aaron He's
> been a continuous speaker on Cassandra topics and is one of the
> coordinators for the Planet Cassandra meetup. Those are just the recent
> highlights.
>
> Please join us in congratulating and welcoming Aaron.
>
> The Apache Cassandra PMC members
>
>

Re: CEP-15 Update

2025-03-04 Thread Dinesh Joshi

Thank you for the update and a BIG thanks to all involved in getting us to
this milestone. Looking forward to this work being merged in so we can kick
the tires and help surface any issues early.

On Tue, Mar 4, 2025 at 8:01 AM Benedict Elliott Smith 
wrote:

> Hi everyone,
>
> It’s been exactly 3.5 years since the first commit to cassandra-accord.
> Yes, really, it’s been that long.
>
> We will be starting to validate the feature against real workloads in the
> near future, so we can’t sensibly push off merging much longer. The
> following is a brief run-down of the state of play. There are no known
> bugs, but there remain a number of caveats we will be incrementally
> addressing in the run-up to a full release:
>
> [1] Accord is likely to be SLOW until further optimisations are implemented
> [2] Schema changes have a number of hard edges
> [3] Validation is ongoing, so there are likely still a number of bugs to
> shake out
> [4] Many operator visibility/tooling/documentation improvements are pending
>
> To expand a little:
>
> [1] As of the last experiment we conducted, Accord’s throughput was poor,
> which also led to higher LAN latencies. We have done no WAN experiments to
> date, but the protocol guarantees should already achieve better round-trip
> performance, in particular under contention. Improving throughput will be
> the main focus of attention once we are satisfied the protocol is otherwise
> stable, but our focus remains validation for the moment.
> [2] Schema changes have not yet been well integrated with TCM. Dropping a
> table for instance will currently cause problems if nodes are offline.
> [3] We have a range of validations we are already performing against
> cassandra-accord directly, and against its integration with Cassandra in
> cep-15-accord. We have run hundreds of billions of simulated transactions,
> and are still discovering some minor fault every few billion simulated
> transactions or so. There remains a lot more simulated validation to
> explore, as well as with real clusters serving real workloads.
> [4] There are already a range of virtual tables for exploring internal
> state in Accord, and reasonably good metric support. However, tracing is
> not yet supported, and our metric and virtual table integrations need some
> further development.
> [5] There are also other edge cases to address, such as ensuring we do not
> reuse HLCs after restart and supporting ByteOrderedPartitioner. Live
> migration from/to Paxos is undergoing fine-tuning and validation; there are
> probably some other things I am forgetting.
>
> Altogether the feature is fairly mature, despite these caveats. This is
> the fruit of the labour of a long list of contributors, including Aleksey
> Yeschenko, Alex Petrov, Ariel Weisberg, Blake Eggleston, Caleb Rackliffe
> and David Capwell, and represents a huge undertaking. It also wouldn’t have
> been possible without the work of Alex Petrov, Marcus Eriksson and Sam
> Tunnicliffe on delivering transactional cluster metadata. I hope you will
> join me in thanking them all for their contributions.
>
> Alex has also kindly produced some initial overview documentation for
> developers, which can be found here:
> https://github.com/apache/cassandra/blob/cep-15-accord/doc/modules/cassandra/pages/developing/accord/index.adoc
> This will be expanded as time permits.
>
> Does anyone have any questions or concerns?

Re: CEP-15 Update