> - Resources isolation. Having the said service running within the same JVM 
> may negatively impact Cassandra storage's performance. It could be more 
> beneficial to have them in Sidecar, which offers strong resource isolation 
> guarantees.

How does having this in a sidecar change the impact on “storage performance”?
The sidecar reading sstables will have the same impact on storage IO as the
main process reading sstables.  Given that the sidecar is running on the same
node as the main C* process, the only real resource isolation you get is in
heap/GC.  CPU/Memory/IO are all still shared between the main C* process and
the sidecar, and coordinating those across processes is harder than
coordinating them within a single process.  For example, if we wanted to have
the compaction throughput, streaming throughput, and analytics read throughput
all tied back to a single disk IO cap, that is harder with an external process.
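
To illustrate the single disk IO cap idea, here is a rough sketch of an
in-process token bucket that compaction, streaming, and analytics reads could
all draw from.  SharedIoBudget and its methods are hypothetical names made up
for this example, not existing Cassandra or Sidecar classes, and the clock is
passed in only to keep the sketch deterministic.

```java
import java.util.function.LongSupplier;

/** Hypothetical shared disk IO budget (illustration only, not Cassandra code). */
final class SharedIoBudget {
    private final long bytesPerSecond;
    private final LongSupplier nanoClock;
    private double availableBytes;
    private long lastRefillNanos;

    SharedIoBudget(long bytesPerSecond, LongSupplier nanoClock) {
        this.bytesPerSecond = bytesPerSecond;
        this.nanoClock = nanoClock;
        this.availableBytes = bytesPerSecond; // start with one second of budget
        this.lastRefillNanos = nanoClock.getAsLong();
    }

    /** Reserve bytes from the shared cap; returns false if the cap is exhausted. */
    synchronized boolean tryAcquire(long bytes) {
        refill();
        if (bytes > availableBytes) {
            return false;
        }
        availableBytes -= bytes;
        return true;
    }

    /** Add budget for the time elapsed since the last call, up to one second's worth. */
    private void refill() {
        long now = nanoClock.getAsLong();
        double elapsedSeconds = (now - lastRefillNanos) / 1e9;
        availableBytes = Math.min((double) bytesPerSecond,
                                  availableBytes + elapsedSeconds * bytesPerSecond);
        lastRefillNanos = now;
    }
}
```

Inside one JVM every consumer can hold a reference to the same budget object.
An external process would need some form of cross-process coordination to get
the same effect, which is the extra difficulty I am pointing at.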

> - Complexity. Considering the existence of the Sidecar project, it would be 
> less complex to avoid adding another (http?) service in Cassandra.

I'm not sure that is really very complex; running an http service is pretty
easy, and we already have netty in use to instantiate one from.
I worry more about the complexity of having the matching schema for a set of
sstables being read; of new sstable versions/formats being introduced; of
having up to date data from memtables considered by this API without having
to flush before every query of it; of dealing with the new memtable API
introduced in CEP-11; and of coordinating compaction/streaming adding and
removing files with these APIs reading them.  There are a lot of edge cases
to consider for this external access to sstables that the main process
considers itself the “owner” of.
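
To make the compaction/streaming coordination point concrete, here is a
minimal sketch of in-process reference counting, where a file that compaction
has replaced is only deleted once the last reader lets go of it.
TrackedSSTable and its methods are hypothetical names for illustration and
are not the actual Cassandra implementation.

```java
/** Hypothetical reader-tracked sstable handle (illustration only). */
final class TrackedSSTable {
    private int readers = 0;
    private boolean obsolete = false;
    private boolean deleted = false;

    /** A reader takes a reference; fails once the sstable has been replaced. */
    synchronized boolean tryRef() {
        if (obsolete) {
            return false;
        }
        readers++;
        return true;
    }

    /** A reader is done; the last release of an obsolete sstable deletes it. */
    synchronized void release() {
        if (readers == 0) {
            throw new IllegalStateException("release without a matching ref");
        }
        readers--;
        maybeDelete();
    }

    /** Compaction or streaming replaced this sstable; delete once unused. */
    synchronized void markObsolete() {
        obsolete = true;
        maybeDelete();
    }

    synchronized boolean isDeleted() {
        return deleted;
    }

    private void maybeDelete() {
        if (obsolete && readers == 0) {
            deleted = true; // a real implementation would remove the files here
        }
    }
}
```

A reader in another process cannot take part in this kind of bookkeeping
directly, so it would need some other agreed mechanism to keep files from
disappearing underneath it.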

All of this is not to say that I think separating things out into other 
processes/services is bad.  But I think we need to be very careful with how we 
do it, or end users will end up running into all the sharp edges and the 
feature will fail.

-Jeremiah

> On Mar 24, 2023, at 8:15 PM, Yifan Cai <yc25c...@gmail.com> wrote:
> 
> Hi Jeremiah, 
> 
> There are good reasons to not have these inside Cassandra. Consider the 
> following.
> - Resources isolation. Having the said service running within the same JVM 
> may negatively impact Cassandra storage's performance. It could be more 
> beneficial to have them in Sidecar, which offers strong resource isolation 
> guarantees.
> - Availability. If the Cassandra cluster is being bounced, using sidecar 
> would not affect the SBR/SBW functionality, e.g. SBR can still read SSTables 
> via sidecar endpoints. 
> - Compatibility. Sidecar provides stable REST-based APIs, such as uploading 
> SSTables endpoint, which would remain compatible with different versions of 
> Cassandra. The current implementation supports versions 3.0 and 4.0.
> - Complexity. Considering the existence of the Sidecar project, it would be 
> less complex to avoid adding another (http?) service in Cassandra.
> - Release velocity. Sidecar, as an independent project, can have a quicker 
> release cycle from Cassandra. 
> - The features in sidecar are mostly implemented based on various existing 
> tools/APIs exposed from Cassandra, e.g. ring, commit sstable, snapshot, etc.
> 
> Regarding authentication and authorization
> - We will add it as a follow-on CEP in Sidecar, but we don't want to hold up 
> this CEP. It would be a feature that benefits all Sidecar endpoints.
> 
> - Yifan
> 
> On Fri, Mar 24, 2023 at 2:43 PM Doug Rohrer <droh...@apple.com 
> <mailto:droh...@apple.com>> wrote:
>> I agree that the analytics library will need to support vnodes. To be clear, 
>> there’s nothing preventing the solution from working with vnodes right now, 
>> and no assumptions about a 1:1 topology between a token and a node. However, 
>> we don’t, today, have the ability to test vnode support end-to-end. We are 
>> working towards that, however, and should be able to remove the caveat from 
>> the released analytics library once we can properly test vnode support.
>> If it helps, I can update the CEP to say something more like “Caveat: 
>> Currently untested with vnodes - work is ongoing to remove this limitation” 
>> if that helps?
>> 
>> Doug
>> 
>> > On Mar 24, 2023, at 11:43 AM, Brandon Williams <dri...@gmail.com 
>> > <mailto:dri...@gmail.com>> wrote:
>> > 
>> > On Fri, Mar 24, 2023 at 10:39 AM Jeremiah D Jordan
>> > <jeremiah.jor...@gmail.com <mailto:jeremiah.jor...@gmail.com>> wrote:
>> >> 
>> >> I have concerns with the majority of this being in the sidecar and not in 
>> >> the database itself.  I think it would make sense for the server side of 
>> >> this to be a new service exposed by the database, not in the sidecar.  
>> >> That way it can be able to properly integrate with the authentication and 
>> >> authorization apis, and to make it a first class citizen in terms of 
>> >> having unit/integration tests in the main DB ensuring no one breaks it.
>> > 
>> > I don't think this can/should happen until it supports the database's
>> > default configuration with vnodes.
>> 
