Re: [DISCUSS] CEP-56 Spark Bulk Reading from Cassandra Backup Uploaded to Object Storage

James Berragan Tue, 07 Oct 2025 10:52:52 -0700

+1! This is something I always hoped we would get to with Analytics. +1 on
creating a new DataLayer too. It would be good to flesh out in a bit more
detail how the SSTableKey will stay flexible across different backup
layouts. Something else to consider is that many people encrypt their
backups in S3 (sometimes at the individual SSTable or file level).


On Tue, 7 Oct 2025 at 10:24, Liu Cao <[email protected]> wrote:

> Hi fellow Cassandra devs,
>
> I'd like to propose CEP-56: Spark Bulk Reading from Cassandra Backup
> Uploaded to Object Storage
>
> This is about enabling cassandra-analytics to perform bulk reading from
> Cassandra snapshot backups stored in object storage like S3. This approach
> aims to decouple bulk reading from the online Cassandra cluster (including
> side-car), providing full isolation and predictable performance for
> analytics.
>
> The initial object storage support would be AWS S3. Please help review the
> new public interfaces and example usage and provide any feedback as needed.
>
>
> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-56%3A+Spark+Bulk+Reading+from+Cassandra+Backup+Uploaded+to+Object+Storage
>
>
> Best Regards,
>
> --
>
> Liu Cao
>
>
>
