You're correct about the DC migration being the safest/best option.

Another option would be to change the token count in smaller increments so
you don't overload the old nodes, but that's an incredibly slow and painful
process that you almost certainly shouldn't attempt.
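For reference, the new-DC route mostly comes down to bringing the new nodes up with the lower token count (and, on 4.x, the token allocator) configured before they join. A minimal sketch of the relevant settings - datacenter and rack names here are made-up examples, so adjust to your topology:

```yaml
# cassandra.yaml on each node in the NEW datacenter
num_tokens: 16
# Cassandra 4.x: let the token allocator place tokens for even
# ownership at your replication factor (rf=3 here)
allocate_tokens_for_local_replication_factor: 3

# cassandra-rackdc.properties on the same nodes (example names):
# dc=dc2
# rack=rack1
```

After the new DC is up, the usual sequence is: alter your keyspaces to replicate into the new DC, run `nodetool rebuild` on each new node pointing at the old DC, repoint clients, then decommission the old DC - but double-check the exact steps against the docs for your version.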

One of the biggest downsides to higher token counts is repair cost.  The
subrange repairs managed by Reaper actually dampen the impact a lot here:
pre-Reaper, especially with Cassandra <=3.0, repairs could lead to SSTable
explosion and GC load that could basically collapse the cluster.  With
Cassandra 3.11+ and Reaper the impact is far less severe; it mostly just
makes repairs slower.  In my experience a few years ago, going from 256 to
16 tokens cut repair times in half.
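To illustrate what "subrange repair" means mechanically, here's a small sketch (my own illustration, not Reaper's actual code) of carving the full Murmur3 token ring into equal segments, the way Reaper turns one big repair into many small `nodetool repair -st <start> -et <end>` invocations:

```python
# Murmur3Partitioner token bounds
MIN_TOKEN = -(2**63)
MAX_TOKEN = 2**63 - 1

def split_ring(num_segments):
    """Return (start, end) token pairs covering the full ring contiguously."""
    span = (MAX_TOKEN - MIN_TOKEN) // num_segments
    segments = []
    start = MIN_TOKEN
    for i in range(num_segments):
        # last segment absorbs the rounding remainder
        end = MAX_TOKEN if i == num_segments - 1 else start + span
        segments.append((start, end))
        start = end
    return segments

segments = split_ring(256)
print(len(segments))  # 256 small repairs instead of one monster repair
```

Each small segment touches far fewer SSTables at a time, which is why it avoids the old overstreaming/GC blowups.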

There are some availability implications to higher num_tokens values,
partly mitigated by using NetworkTopologyStrategy:
https://github.com/jolynch/python_performance_toolkit/raw/master/notebooks/cassandra_availability/whitepaper/cassandra-availability-virtual.pdf
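The intuition from that paper can be shown with a toy Monte Carlo model (random token placement, SimpleStrategy-style replica walk - NOT Cassandra's real allocator or NTS, so treat the numbers as illustrative only): the more vnodes per node, the more distinct replica sets exist, so the more likely that losing any 2 nodes at once takes out a QUORUM for *some* token range:

```python
import random

def quorum_loss_prob(nodes=24, vnodes=16, rf=3, trials=300, seed=7):
    """Estimate P(some range loses quorum | 2 random nodes down) under a
    simplified random-token, walk-the-ring replica placement model."""
    rng = random.Random(seed)
    losses = 0
    for _ in range(trials):
        # assign `vnodes` random tokens to each node, then sort into a ring
        ring = sorted((rng.random(), n) for n in range(nodes) for _ in range(vnodes))
        down = set(rng.sample(range(nodes), 2))
        lost = False
        for i in range(len(ring)):
            # replicas of range i: next rf distinct node owners on the ring
            replicas, j = [], i
            while len(replicas) < rf:
                owner = ring[j % len(ring)][1]
                if owner not in replicas:
                    replicas.append(owner)
                j += 1
            if len(down & set(replicas)) >= 2:  # 2 of 3 replicas down: no quorum
                lost = True
                break
        losses += lost
    return losses / trials

for v in (1, 16, 256):
    print(v, quorum_loss_prob(vnodes=v))
```

With 256 vnodes the loss probability under this model is essentially 1, while with few tokens a 2-node failure often misses every replica set entirely.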

Node-joining speed is kind of a weird mix of benefit and deficit.  As far
as I can tell, each token range is streamed in and handled by its own
thread, so if you have 32 cores and at least 32 (probably 96, with rf=3)
nodes already in the cluster, num_tokens=32 will bootstrap faster than
num_tokens=16.  There are sharply diminishing returns much above that,
though, since past that point any cluster large enough to benefit will also
see more pain (node rejoin times, repair times, etc.) from the larger
number of token ranges.
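As a back-of-envelope model of that claim (my reading of streaming behavior, not measured Cassandra internals): effective bootstrap parallelism is capped by core count, by the number of token ranges, and by how many distinct replica groups the existing cluster can serve from, roughly existing_nodes / rf:

```python
# Toy model: one streaming thread per token range, capped by cores and
# by the number of distinct source replica groups in the cluster.
def join_parallelism(cores, num_tokens, existing_nodes, rf=3):
    return min(cores, num_tokens, existing_nodes // rf)

print(join_parallelism(cores=32, num_tokens=16, existing_nodes=96))   # 16
print(join_parallelism(cores=32, num_tokens=32, existing_nodes=96))   # 32
print(join_parallelism(cores=32, num_tokens=256, existing_nodes=96))  # 32: no gain past core count
```

Which is the diminishing-returns point: past min(cores, cluster size / rf), extra token ranges add overhead without adding streaming parallelism.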



On Tue, Jul 22, 2025 at 6:39 PM Isaeed Mohanna <isa...@xsense.co> wrote:

> Hi
>
> In a 4 node cluster with replication factor 3 we have been using the
> default num_tokens=256 from Cassandra 3. We have upgraded to Cassandra 4.1
> last year and planning on upgrading to 5 soon and we see that the
> recommendation has been changed to 16.
>
> An attempt to do a rolling rebuild of the cluster with num_tokens=16
> failed: once I take down one of the old nodes, most of its data arrives
> at the other old nodes, which then become strained and at high risk of
> crashing.
>
> Each of the old nodes stores data of ~ 700GB.
>
> To my understanding the safest option is to set up a second DC, join it
> to the cluster, and then decommission the old one, but I would like to
> understand how bad it is to keep num_tokens at 256 and how it will affect
> day-to-day use in 4.1 and Cassandra 5.
>
> Thanks for any help,
>
> Isaeed Mohanna
>
>
>
>
>
> Best regards,
>
> Isaeed Mohanna, Software Development Manager | Intelligent Cold Chain
> Monitoring
>
> E-Mail: isa...@xsense.co | Web: www.xsense.co
>
>
>
> If you have further questions please don’t hesitate to contact me.
>
>
>

