Even if we don't want to allow a default, we can keep the same CREATE INDEX syntax in place, and have a guardrail forcing (or not) the selection of an implementation, right? This would be no worse than the YAML option we already have for enabling 2i creation as a whole.
On Fri, May 12, 2023 at 12:28 PM Benedict <bened...@apache.org> wrote: > I’m not convinced a default index makes any sense, no. The trade-offs in a > distributed setting are much more pronounced. > > Indexes in a local-only RDBMS are much simpler affairs; the trade offs are > much more subtle than here. > > On 12 May 2023, at 18:24, Caleb Rackliffe <calebrackli...@gmail.com> > wrote: > > > > Now, giving this thread, there is pushback for a config to allow > default impl to change… but there is 0 pushback for new syntax to make this > explicit…. So maybe we should [POLL] for what syntax people want? > > I think the essential question is whether we want the concept of a default > index. If we do, we need to figure that out now. If we don't then a new > syntax that forces it becomes interesting. > > Given it seems most DBs have a default index (see Postgres, etc.), I tend > to lean toward having one, but that's me... > > On Fri, May 12, 2023 at 12:20 PM David Capwell <dcapw...@apple.com> wrote: > >> I really dislike the idea of the same CQL doing different things based upon >> a per-node configuration. >> >> >> I agree with Brandon that changing CQL behaviour like this based on node >> config is really not ideal. >> >> >> I am cool adding such a config, and also cool keeping CREATE INDEX >> disabled by default…. But would like to point out that we have many configs >> that impact CQL and they are almost always local configs… >> >> Is CREATE INDEX even allowed? This is a per node config. Right now you >> can block globally, enable on a single instance, create the index for your >> users, then revert the config change on the instance…. >> >> All guardrails that define what we can do are per node configs… >> >> Now, giving this thread, there is pushback for a config to allow default >> impl to change… but there is 0 pushback for new syntax to make this >> explicit…. So maybe we should [POLL] for what syntax people want? >> >> if we decide before the 5.0 release that we have enough information to >> change the default (#1), we can change it in a matter of minutes. >> >> >> I am strongly against this… SAI is new for 5.0 so should be disabled by >> default; else we disrespect the idea that new features are disabled by >> default. I am cool with our docs recommending if we do find its better in >> most cases, but we should not change the default in the same reason it >> lands in. >> >> On May 12, 2023, at 10:10 AM, Caleb Rackliffe <calebrackli...@gmail.com> >> wrote: >> >> I don't want to cut over for 5.0 either way. I was more contrasting a >> configurable cutover in 5.0 vs. a hard cutover later. >> >> On Fri, May 12, 2023 at 12:09 PM Benedict <bened...@apache.org> wrote: >> >>> If the performance characteristics are as clear cut as you think, then >>> maybe it will be an easy decision once the evidence is available for >>> everyone to consider? >>> >>> If not, then we probably can’t do the hard cutover and so the answer is >>> still pretty simple? >>> >>> On 12 May 2023, at 18:04, Caleb Rackliffe <calebrackli...@gmail.com> >>> wrote: >>> >>> >>> I don't particularly like the YAML solution either, but absent that, >>> we're back to fighting about whether we introduce entirely new syntax or >>> hard cut over to SAI at some point. >>> >>> We already have per-node configuration in the YAML that determines >>> whether or not we can create a 2i at all, right? >>> >>> What if we just do #2 and #3 and punt on everything else? >>> >>> On Fri, May 12, 2023 at 11:56 AM Benedict <bened...@apache.org> wrote: >>> >>>> A table is not a local concept at all, it has a global primary index - >>>> that’s the core idea of Cassandra. >>>> >>>> I agree with Brandon that changing CQL behaviour like this based on >>>> node config is really not ideal. New syntax is by far the simplest and >>>> safest solution to this IMO. It doesn’t have to use the word LOCAL, but I >>>> think that’s anyway an improvement, personally. >>>> >>>> In future we will hopefully offer GLOBAL indexes, and IMO it is better >>>> to reify the distinction in the syntax. >>>> >>>> On 12 May 2023, at 17:29, Caleb Rackliffe <calebrackli...@gmail.com> >>>> wrote: >>>> >>>> >>>> We don't need to know everything about SAI's performance profile to >>>> plan and execute some small, reasonable things now for 5.0. I'm going to >>>> try to summarize the least controversial package of ideas from the >>>> discussion above. I've left out creating any new syntax. For example, I >>>> think CREATE LOCAL INDEX, while explicit, is just not necessary. We >>>> don't use CREATE LOCAL TABLE, although it has the same locality as our >>>> indexes. >>>> >>>> Okay, so the proposal for 5.0... >>>> >>>> 1.) Add a YAML option that specifies a default implementation for CREATE >>>> INDEX, and make this the legacy 2i for now. No existing DDL breaks. We >>>> don't have to commit to the absolute superiority of SAI. >>>> 2.) Add USING...WITH... support to CREATE INDEX, so we don't have to >>>> go to market using CREATE CUSTOM INDEX, which feels...not so polished. >>>> (The backend for this already exists w/ CREATE CUSTOM INDEX.) >>>> 3.) Leave in place but deprecate (client warnings could work?) CREATE >>>> CUSTOM INDEX. Support the syntax for the foreseeable future. >>>> >>>> Can we live w/ this? >>>> >>>> I don't think any information about SAI we could possibly acquire >>>> before a 5.0 release would affect the reasonableness of this much. >>>> >>>> >>>> On Fri, May 12, 2023 at 10:54 AM Benedict <bened...@apache.org> wrote: >>>> >>>>> if we didn't have copious amounts of (not all public, I know, working >>>>> on it) evidence >>>>> >>>>> >>>>> If that’s the assumption on which this proposal is based, let’s >>>>> discuss the evidence base first, as given the fundamentally different way >>>>> they work (almost diametrically opposite), I would want to see a very high >>>>> quality of evidence to support the claim. >>>>> >>>>> I don’t think we can resolve this conversation effectively until this >>>>> question is settled. >>>>> >>>>> On 12 May 2023, at 16:19, Caleb Rackliffe <calebrackli...@gmail.com> >>>>> wrote: >>>>> >>>>> >>>>> > This creates huge headaches for everyone successfully using 2i today >>>>> though, and SAI *is not* guaranteed to perform as well or better - it has >>>>> a >>>>> very different performance profile. >>>>> >>>>> We wouldn't have even advanced it to this point if we didn't have >>>>> copious amounts of (not all public, I know, working on it) evidence it did >>>>> for the vast majority of workloads. Having said that, I don't strongly >>>>> agree that we should make it the default in 5.0, because performance isn't >>>>> the only concern. (correctness, DDL back-compat, which we've sort of >>>>> touched w/ the YAML default option, etc.) >>>>> >>>>> This conversation is now going in like 3 different directions, or at >>>>> least 3 different "packages" of ideas, so there isn't even a single thing >>>>> to vote on. Let me read through again and try to distill into something >>>>> that we might be able to do so with... >>>>> >>>>> On Fri, May 12, 2023 at 7:56 AM Aleksey Yeshchenko <alek...@apple.com> >>>>> wrote: >>>>> >>>>>> This. >>>>>> >>>>>> I would also consider adding CREATE LEGACY INDEX syntax as an alias >>>>>> for today’s CREATE INDEX, the latter to be deprecated and (in very >>>>>> distant >>>>>> future) removed. >>>>>> >>>>>> On 12 May 2023, at 13:14, Benedict <bened...@apache.org> wrote: >>>>>> >>>>>> This creates huge headaches for everyone successfully using 2i today >>>>>> though, and SAI *is not* guaranteed to perform as well or better - it >>>>>> has a >>>>>> very different performance profile. >>>>>> >>>>>> I think we should deprecate CREATE INDEX, and introduce new syntax >>>>>> CREATE LOCAL INDEX to make clear that this is not a global index, and >>>>>> that >>>>>> this should require the USING syntax to avoid this problem in future. >>>>>> >>>>>> We should report warnings to the client when CREATE INDEX is used, >>>>>> indicating it is deprecated. >>>>>> >>>>>> >>>>>> >>