Only that it makes it easier to spin up a cluster. I'm for removing it entirely as well; however, I think we should keep it around at least until the next major release, as a safety precaution until the algorithm is properly battle-tested.
This is not a strongly held opinion though, I'm just foreseeing the "new defaults don't work for my edge case" problem.

On Sun., 23 Sep. 2018, 04:12 Jonathan Haddad, <j...@jonhaddad.com> wrote:

> Is there a use case for random allocation? How does it help with testing? I
> can’t see a reason to keep it around.
>
> On Sat, Sep 22, 2018 at 3:06 AM kurt greaves <k...@instaclustr.com> wrote:
>
> > +1. I've been making a case for this for some time now, and was actually a
> > focus of my talk last week. I'd be very happy to get this into 4.0.
> >
> > We've tested various num_tokens with the algorithm on various sized
> > clusters and we've found that typically 16 works best. With lower numbers
> > we found that balance is good initially but as a cluster gets larger you
> > have some problems. E.g We saw that on a 60 node cluster with 8 tokens per
> > node we were seeing a difference of 22% in token ownership, but on a <=12
> > node cluster a difference of only 12%. 16 tokens on the other hand wasn't
> > perfect but generally gave a better balance regardless of cluster size at
> > least up to 100 nodes. TBH we should probably do some proper testing and
> > record all the results for this before we pick a default (I'm happy to do
> > this - think we can use the original testing script for this).
> >
> > But anyway, I'd say Jon is on the right track. Personally how I'd like to
> > see it is that we:
> >
> > 1. Change allocate_tokens_for_keyspace to allocate_tokens_for_rf in the
> >    same way that DSE does it. Allowing a user to specify a RF to allocate
> >    from, and allowing multiple DC's.
> > 2. Add a new boolean property random_token_allocation, defaults to
> >    false.
> > 3. Make allocate_tokens_for_rf default to *unset**.
> > 4. Make allocate_tokens_for_rf *required*** if num_tokens > 1 and
> >    random_token_allocation != true.
> > 5. Default num_tokens to 16 (or whatever we find appropriate)
> >
> > * I think setting a default is asking for trouble.
> > When people are going to add new DC's/nodes we don't want to risk them
> > adding a node with the wrong RF. I think it's safe to say that a user
> > should have to think about this before they spin up their cluster.
> > ** Following above, it should be required to be set so that we don't have
> > people accidentally using random allocation. I think we should really be
> > aiming to get rid of random allocation completely, but provide a new
> > property to enable it for backwards compatibility (also for testing).
> >
> > It's worth noting that a smaller number of tokens *theoretically* decreases
> > the time for replacement/rebuild, so if we're considering QUORUM
> > availability with vnodes there's an argument against having a very low
> > num_tokens. I think it's better to utilise NTS and racks to reduce the
> > chance of a QUORUM outage over banking on having a lower number of tokens,
> > as with just a low number of tokens unless you go all the way to 1 you are
> > just relying on luck that 2 nodes don't overlap. Guess what I'm saying is
> > that I think we should be choosing a num_tokens that gives the best
> > distribution for most cluster sizes rather than choosing one that
> > "decreases" the probability of an outage.
> >
> > Also I think we should continue using CASSANDRA-13701 to track this. TBH I
> > think in general we should be a bit better at searching for and using
> > existing tickets...
> >
> > On Sat, 22 Sep 2018 at 18:13, Stefan Podkowinski <s...@apache.org> wrote:
> >
> > > There already have been some discussions on this here:
> > > https://issues.apache.org/jira/browse/CASSANDRA-13701
> > >
> > > The mentioned blocker there on the token allocation shouldn't exist
> > > anymore. Although it would be good to get more feedback on it, in case
> > > we want to enable it by default, along with new defaults for number of
> > > tokens.
> > >
> > > On 22.09.18 06:30, Dinesh Joshi wrote:
> > > > Jon, thanks for starting this thread!
> > > >
> > > > I have created CASSANDRA-14784 to track this.
> > > >
> > > > Dinesh
> > > >
> > > >> On Sep 21, 2018, at 9:18 PM, Sankalp Kohli <kohlisank...@gmail.com> wrote:
> > > >>
> > > >> Putting it on JIRA is to make sure someone is assigned to it and it is
> > > >> tracked. Changes should be discussed over ML like you are saying.
> > > >>
> > > >> On Sep 21, 2018, at 21:02, Jonathan Haddad <j...@jonhaddad.com> wrote:
> > > >>
> > > >>>> We should create a JIRA to find what other defaults we need revisit.
> > > >>> Changing a default is a pretty big deal, I think we should discuss any
> > > >>> changes to defaults here on the ML before moving it into JIRA. It's nice
> > > >>> to get a bit more discussion around the change than what happens in JIRA.
> > > >>>
> > > >>> We (TLP) did some testing on 4 tokens and found it to work surprisingly
> > > >>> well. It wasn't particularly formal, but we verified the load stays
> > > >>> pretty even with only 4 tokens as we added nodes to the cluster. Higher
> > > >>> token count hurts availability by increasing the number of nodes any given
> > > >>> node is a neighbor with, meaning any 2 nodes that fail have an increased
> > > >>> chance of downtime when using QUORUM. In addition, with the recent
> > > >>> streaming optimization it seems the token counts will give a greater chance
> > > >>> of a node streaming entire sstables (with LCS), meaning we'll do a better
> > > >>> job with node density out of the box.
> > > >>>
> > > >>> Next week I can try to put together something a little more convincing.
> > > >>> Weekend time.
> > > >>>
> > > >>> Jon
> > > >>>
> > > >>>
> > > >>> On Fri, Sep 21, 2018 at 8:45 PM sankalp kohli <kohlisank...@gmail.com>
> > > >>> wrote:
> > > >>>
> > > >>>> +1 to lowering it.
> > > >>>> Thanks Jon for starting this. We should create a JIRA to find what other
> > > >>>> defaults we need revisit. (Please keep this discussion for "default token"
> > > >>>> only.)
> > > >>>>
> > > >>>>> On Fri, Sep 21, 2018 at 8:26 PM Jeff Jirsa <jji...@gmail.com> wrote:
> > > >>>>>
> > > >>>>> Also agree it should be lowered, but definitely not to 1, and probably
> > > >>>>> something closer to 32 than 4.
> > > >>>>>
> > > >>>>> --
> > > >>>>> Jeff Jirsa
> > > >>>>>
> > > >>>>>
> > > >>>>>> On Sep 21, 2018, at 8:24 PM, Jeremy Hanna <jeremy.hanna1...@gmail.com> wrote:
> > > >>>>>> I agree that it should be lowered. What I’ve seen debated a bit in the
> > > >>>>>> past is the number but I don’t think anyone thinks that it should remain
> > > >>>>>> 256.
> > > >>>>>>> On Sep 21, 2018, at 7:05 PM, Jonathan Haddad <j...@jonhaddad.com> wrote:
> > > >>>>>>> One thing that's really, really bothered me for a while is how we default
> > > >>>>>>> to 256 tokens still. There's no experienced operator that leaves it as is
> > > >>>>>>> at this point, meaning the only people using 256 are the poor folks that
> > > >>>>>>> just got started using C*. I've worked with over a hundred clusters in the
> > > >>>>>>> last couple years, and I think I only worked with one that had lowered it
> > > >>>>>>> to something else.
> > > >>>>>>>
> > > >>>>>>> I think it's time we changed the default to 4 (or 8, up for debate).
> > > >>>>>>>
> > > >>>>>>> To improve the behavior, we need to change a couple other things. The
> > > >>>>>>> allocate_tokens_for_keyspace setting is... odd. It requires you have a
> > > >>>>>>> keyspace already created, which doesn't help on new clusters.
> > > >>>>>>> What I'd like to do is add a new setting, allocate_tokens_for_rf, and set
> > > >>>>>>> it to 3 by default.
> > > >>>>>>>
> > > >>>>>>> To handle clusters that are already using 256 tokens, we could prevent the
> > > >>>>>>> new node from joining unless a -D flag is set to explicitly allow
> > > >>>>>>> imbalanced tokens.
> > > >>>>>>>
> > > >>>>>>> We've agreed to a trunk freeze, but I feel like this is important enough
> > > >>>>>>> (and pretty trivial) to do now. I'd also personally characterize this as a
> > > >>>>>>> bug fix since 256 is horribly broken when the cluster gets to any
> > > >>>>>>> reasonable size, but maybe I'm alone there.
> > > >>>>>>>
> > > >>>>>>> I honestly can't think of a use case where random tokens is a good choice
> > > >>>>>>> anymore, so I'd be fine / ecstatic with removing it completely and
> > > >>>>>>> requiring either allocate_tokens_for_keyspace (for existing clusters)
> > > >>>>>>> or allocate_tokens_for_rf to be set.
> > > >>>>>>>
> > > >>>>>>> Thoughts? Objections?
> > > >>>>>>> --
> > > >>>>>>> Jon Haddad
> > > >>>>>>> http://www.rustyrazorblade.com
> > > >>>>>>> twitter: rustyrazorblade
> > > >>>>>>
> > > >>>>>> ---------------------------------------------------------------------
> > > >>>>>> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > > >>>>>> For additional commands, e-mail: dev-h...@cassandra.apache.org
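
For anyone skimming the thread: points 2 to 5 of the proposal in my earlier mail (quoted above) boil down to a startup check. Below is a rough Python sketch of that check, only to make the rule concrete. The property names (num_tokens, allocate_tokens_for_rf, random_token_allocation) are the ones from the proposal; the TokenConfig class and validate_token_config function are made up for illustration and are not Cassandra code, and the default of 16 is just the number floated in the thread, not a decided value.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TokenConfig:
    """Hypothetical holder for the three settings named in the proposal."""
    num_tokens: int = 16                          # point 5: proposed default, still up for debate
    allocate_tokens_for_rf: Optional[int] = None  # point 3: unset by default
    random_token_allocation: bool = False         # point 2: defaults to false


def validate_token_config(cfg: TokenConfig) -> None:
    """Raise ValueError when point 4 of the proposal is violated.

    Point 4: allocate_tokens_for_rf is required if num_tokens > 1
    and random_token_allocation is not true.
    """
    needs_rf = cfg.num_tokens > 1 and not cfg.random_token_allocation
    if needs_rf and cfg.allocate_tokens_for_rf is None:
        raise ValueError(
            "allocate_tokens_for_rf must be set when num_tokens > 1 "
            "and random_token_allocation is not true"
        )


if __name__ == "__main__":
    # An operator who thought about RF up front passes the check.
    validate_token_config(TokenConfig(allocate_tokens_for_rf=3))
    # Untouched defaults fail fast instead of silently allocating randomly.
    try:
        validate_token_config(TokenConfig())
    except ValueError as err:
        print(f"rejected: {err}")
```

Under this reading, a config that leaves everything at the defaults is rejected at startup, which is the "user should have to think about this" behaviour from the footnotes; single-token nodes and explicit random_token_allocation: true are the only ways to start without an RF.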