> TL;DR: based on availability, skew, and operational concerns, and on the
> fact that the new allocation algorithm can now be configured fairly
> simply, this is a proposal to go with 4 as the new default, with
> allocate_tokens_for_local_replication_factor set to 3.
While I (mostly) understand the maths behind using 4 vnodes as a default
(which really is a question of extreme availability), I don't think they
provide noticeable performance improvements over using 16, while 16 vnodes
will protect folks from imbalances. It is very hard to deal with unbalanced
cl
>
> We should be using the default value that benefits the most people, rather
> than an arbitrary compromise.
I'd caution we're talking about the default value *we believe* will benefit
the most people according to our respective understandings of C* usage.
Most clusters don't shrink, they stay
Hey all,
At some point not too long ago I spent some time trying to
make the token allocation algorithm the default.
I didn't foresee it, although it might be obvious for many of
you, but one corollary of the way the algorithm works (or more
precisely might not work) with multiple seeds or simult
So why even have virtual nodes at all, why not work on improving single
token approaches so that we can support cluster doubling, which IMO would
enable cassandra to more quickly scale for volatile loads?
It's my guess/understanding that vnodes eliminate the token rebalancing
that existed back in
On 1/31/20 9:58 AM, Dimitar Dimitrov wrote:
one corollary of the way the algorithm works (or more
precisely might not work) with multiple seeds or simultaneous
multi-node bootstraps or decommissions, is that a lot of dtests
start failing due to deterministic token conflicts. I wasn't
able to fix
"large/giant clusters and admins are the target audience for the value we
select"
There are reasons aside from massive scale to pick cassandra, but the
primary reason cassandra is selected technically is to support horizontally
scaling to large clusters.
Why pick a value that once you reach scale y
edit: 4 is bad at small cluster sizes and could scare off adoption
On Fri, Jan 31, 2020 at 12:15 PM Carl Mueller
wrote:
> "large/giant clusters and admins are the target audience for the value we
> select"
>
> There are reasons aside from massive scale to pick cassandra, but the
> primary reason
I think that we might be bikeshedding this number a bit because it is easy
to debate and there is not yet one right answer. I hope we recognize either
choice (4 or 16) is fine in that users can always override us and we can
always change our minds later or better yet improve allocation so users
don
On Fri, Jan 31, 2020 at 11:25 AM Joseph Lynch wrote:
> I think that we might be bikeshedding this number a bit because it is easy
> to debate and there is not yet one right answer.
>
https://www.youtube.com/watch?v=v465T5u9UKo
I think Mick and Anthony make some valid operational and skew points for
smaller/starting clusters with 4 num_tokens. There’s an arbitrary line between
small and large clusters but I think most would agree that most clusters are on
the small to medium side. (A small nuance is afaict the probabil
+1 (non-binding)
I've briefly tested the build with Jepsen.
https://github.com/scalar-labs/scalar-jepsen
On Fri, Jan 31, 2020 at 13:37, Anthony Grasso wrote:
> +1 (non-binding)
>
> On Fri, 31 Jan 2020 at 08:48, Joshua McKenzie
> wrote:
>
> > +1
> >
> > On Thu, Jan 30, 2020 at 4:31 PM Brandon Williams
> wrote: