Oh sorry - a cluster per application makes sense. Sharding within an application also makes sense, to avoid very, very large clusters (think: ~a thousand nodes). One cluster per app/use case.
On Fri, Nov 12, 2021 at 1:39 PM S G <sg.online.em...@gmail.com> wrote:

> Thanks Jeff.
> Any side effects on the client config from the small-clusters perspective?
>
> Several smaller clusters mean more CassandraClient objects on the client
> side, but the number of connections should stay roughly the same, since
> the number of physical nodes will most likely remain the same. So I think
> the client side would not see any major issue.
>
>
> On Fri, Nov 12, 2021 at 11:46 AM Jeff Jirsa <jji...@gmail.com> wrote:
>
>> Most people are better served building multiple clusters and spending
>> their engineering time optimizing for maintaining multiple clusters,
>> rather than spending it learning how to work around the sharp edges that
>> make large shared clusters hard.
>>
>> Large multi-tenant clusters give you less waste and a bit more
>> elasticity (one tenant can burst and use spare capacity that would
>> typically be left for the other tenants). However, one bad use case /
>> table can ruin things for everyone (one bad read that triggers GC affects
>> all use cases), and eventually certain mechanisms/subsystems don't scale
>> past certain points (e.g. schema - large schemas and large clusters are
>> much harder to manage than small schemas and small clusters).
>>
>>
>> On Fri, Nov 12, 2021 at 11:31 AM S G <sg.online.em...@gmail.com> wrote:
>>
>>> Hello,
>>>
>>> Is there any case where we would prefer one big giant cluster (with
>>> multiple large tables) over several smaller clusters?
>>> Apart from some management overhead of multiple Cassandra clients, it
>>> seems several smaller clusters are always better than a big one:
>>>
>>> 1. Avoids a SPOF for all tables
>>> 2. Helps debugging (less noise from other tables in the logs)
>>> 3. Traffic spikes on one table do not affect tables in other clusters
>>> 4. Tables can be scaled independently of each other - so colder data
>>>    can live in a smaller cluster (more data per node) while hotter data
>>>    can be on a bigger cluster (less data per node)
>>>
>>> It does not mean that every table should be in its own cluster.
>>> But large ones (say, more than a few terabytes) can be moved to their
>>> own dedicated clusters, and smaller ones can be grouped together in one
>>> or a few clusters.
>>>
>>> Please share any recommendations for the above from actual production
>>> experience.
>>> Thanks for helping!
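
For the client-side point above (one client object per cluster, with the total connection count driven mostly by the number of nodes contacted), here is a minimal sketch of what that might look like with the DataStax Java driver 4.x. The cluster names, hostnames, and datacenter name are illustrative assumptions, not details from the thread:

```java
import com.datastax.oss.driver.api.core.CqlSession;

import java.net.InetSocketAddress;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class PerClusterSessions {

    // One CqlSession per cluster. Each session keeps its own connection pool,
    // but the overall connection count is still roughly
    // (number of nodes) x (connections per node), i.e. about the same as a
    // single large cluster with the same total node count.
    static CqlSession connect(List<InetSocketAddress> contactPoints, String localDc) {
        return CqlSession.builder()
                .addContactPoints(contactPoints)
                .withLocalDatacenter(localDc)
                .build();
    }

    public static void main(String[] args) {
        // Hypothetical clusters split by use case / table group.
        Map<String, List<InetSocketAddress>> clusters = Map.of(
                "hot-data",  List.of(new InetSocketAddress("hot-cass-1.example.com", 9042)),
                "cold-data", List.of(new InetSocketAddress("cold-cass-1.example.com", 9042)));

        Map<String, CqlSession> sessions = clusters.entrySet().stream()
                .collect(Collectors.toMap(Map.Entry::getKey,
                        e -> connect(e.getValue(), "dc1")));

        // Route each workload to the session for its own cluster.
        sessions.get("hot-data").execute("SELECT release_version FROM system.local");

        sessions.values().forEach(CqlSession::close);
    }
}
```

The main extra cost on the client is keeping a small registry of sessions keyed by cluster/use case instead of a single shared session; the per-node connection pools themselves look the same either way.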