I have read the thread but chose to reply to the top message...

I'm coming to this with the background of having worked with MySQL, where
both the storage engine and the index implementation had many options, and
where some index types were, of course, only available in some engines.

I would humbly suggest:

1. What's up with naming anything "legacy"? Calling the current index type
"2i" seems perfectly fine to me. From what I've heard, it works well for
many users.

2. It should always be possible to specify the index type explicitly. In
other words, it should be possible to CREATE CUSTOM INDEX ... USING "2i"
(if it isn't already).

2b. It should be possible to just say "SAI" or "SASIIndex", not the full
Java path.

3. It's a fair point that the word "CUSTOM" may make this sound a bit too
special. The simplest change IMO is to just make the CUSTOM word optional.

4. Benedict's point that a YAML option is per node is a good one. For
example, you wouldn't want some nodes to create a 2i index and other nodes
a SAI index for the same index. That said, how many other YAML options
can you think of that would create total chaos if different nodes actually
had different values for them? For example, what if a guardrail allowed some
action on some nodes but not others? Maybe what we need is a Jira ticket
to enforce that certain sections of the config must not differ across nodes?

5. That said, the default index type could also be a property of the
keyspace.

6. MySQL allows the DBA to determine the default engine. This seems to work
well. If the user doesn't care, they get the default; if they do care, they
use the explicit syntax.
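
To make points 2b, 5 and 6 a bit more concrete, here is a rough Python
sketch of the resolution order I have in mind: an explicit USING clause
wins, then a keyspace-level default, then a cluster-wide default chosen by
the operator. This is only an illustration and not how Cassandra is
implemented. The function name, the keyspace-level default, and the short
names "2i", "sai" and "sasi" are all assumptions of mine; the two Java
class names are the fully-qualified names as I understand them.

```python
from typing import Optional

# Hypothetical table of accepted spellings per index type. The canonical
# short names on the left are made up for this sketch.
_SPELLINGS = {
    "2i": ("2i",),
    "sai": (
        "sai",
        "StorageAttachedIndex",
        "org.apache.cassandra.index.sai.StorageAttachedIndex",
    ),
    "sasi": (
        "sasi",
        "SASIIndex",
        "org.apache.cassandra.index.sasi.SASIIndex",
    ),
}

# Case-insensitive lookup from any accepted spelling to the canonical name.
_LOOKUP = {
    spelling.lower(): canonical
    for canonical, spellings in _SPELLINGS.items()
    for spelling in spellings
}


def normalize_index_type(name: str) -> str:
    """Map a short alias, simple class name, or full class name to a canonical name."""
    key = name.strip().lower()
    if key not in _LOOKUP:
        raise ValueError(f"unknown index type: {name!r}")
    return _LOOKUP[key]


def resolve_index_type(
    explicit: Optional[str] = None,
    keyspace_default: Optional[str] = None,
    cluster_default: str = "2i",
) -> str:
    """Pick the index type: explicit USING first, then keyspace, then cluster default."""
    for candidate in (explicit, keyspace_default, cluster_default):
        if candidate is not None:
            return normalize_index_type(candidate)
    raise ValueError("no index type configured")
```

With this, resolve_index_type(explicit="SASIIndex", keyspace_default="sai")
picks SASI, while a plain CREATE INDEX with no defaults configured falls
back to "2i". A user who doesn't care never has to name an index type.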

henrik


On Wed, May 10, 2023 at 12:45 AM Caleb Rackliffe <calebrackli...@gmail.com>
wrote:

> Earlier today, Mick started a thread on the future of our index creation
> DDL on Slack:
>
> https://the-asf.slack.com/archives/C018YGVCHMZ/p1683527794220019
>
> At the moment, there are two ways to create a secondary index.
>
> *1.) CREATE INDEX [IF NOT EXISTS] [name] ON <table> (<column>)*
>
> This creates an optionally named legacy 2i on the provided table and
> column.
>
>     ex. CREATE INDEX my_index ON ks.tbl(my_text_col)
>
> *2.) CREATE CUSTOM INDEX [IF NOT EXISTS] [name] ON <table> (<column>)
> USING <class|alias> [WITH OPTIONS = <options>]*
>
> This creates a secondary index on the provided table and column using the
> specified 2i implementation class and (optional) parameters.
>
>     ex. CREATE CUSTOM INDEX my_index ON ks.tbl(my_text_col) USING
> 'StorageAttachedIndex'
>
> (Note that the work on SAI added aliasing, so `StorageAttachedIndex` is
> shorthand for the fully-qualified class name, which is also valid.)
>
> So what is there to discuss?
>
> The concern Mick raised is...
>
> "...just folk continuing to use CREATE INDEX  because they think CREATE
> CUSTOM INDEX is advanced (or just don't know of it), and we leave users
> doing 2i (when they think they are, and/or we definitely want them to be,
> using SAI)"
>
> To paraphrase, we want people to use SAI once it's available where
> possible, and the default behavior of CREATE INDEX could be at odds w/
> that.
>
> The proposal we seem to have landed on is something like the following:
>
> For 5.0:
>
> 1.) Disable by default the creation of new legacy 2i via CREATE INDEX.
> 2.) Leave CREATE CUSTOM INDEX...USING... available by default.
>
> (Note: How this would interact w/ the existing secondary_indexes_enabled
> YAML option isn't clear yet.)
>
> Post-5.0:
>
> 1.) Deprecate and eventually remove SASI when SAI hits full feature parity
> w/ it.
> 2.) Replace both CREATE INDEX and CREATE CUSTOM INDEX w/ something of a
> hybrid between the two. For example, CREATE INDEX...USING...WITH. This
> would both be flexible enough to accommodate index implementation selection
> and prescriptive enough to force the user to make a decision (and wouldn't
> change the legacy behavior of the existing CREATE INDEX). In this world,
> creating a legacy 2i might look something like CREATE INDEX...USING
> `legacy`.
> 3.) Eventually deprecate CREATE CUSTOM INDEX...USING.
>
> Eventually we would have a single enabled DDL statement for index creation
> that would be minimal but also explicit/able to handle some evolution.
>
> What does everyone think?
>


-- 

Henrik Ingo

c. +358 40 569 7354

w. www.datastax.com

