EP-24 because different
local configurations might compromise the security.
From: Henrik Ingo
Sent: Wednesday, May 17, 2023 22:32
To: dev@cassandra.apache.org
Subject: Re: [DISCUSS] The future of CREATE INDEX
> 1. What's up with naming anything "legacy". Calling the current index
type "2i" seems perfectly fine with me. From what I've heard it can work
great for many users?
We can give the existing default secondary index any public-facing name we
like, but "2i" is too broad. It just stands for "seconda
I have read the thread but chose to reply to the top message...
I'm coming to this with the background of having worked with MySQL, where
both the storage engine and index implementation had many options, and
often of course some index types were only available in some engines.
I would humbly sug
I might as well weigh in...
[POLL] Centralize existing syntax or create new syntax?
1.) CREATE INDEX ... USING ... WITH OPTIONS...
(I think the more important protection for users WRT local indexes should
come in the form of a guardrail prohibiting scatter/gather queries against
them.)
[POLL] S
>
> [POLL] Centralize existing syntax or create new syntax?
1.) CREATE INDEX ... USING WITH OPTIONS...
and I think we should keep CREATE CUSTOM INDEX
[POLL] Should there be a default? (YES/NO)
of course YES
[POLL] What to do with the default?
4.) YAML config/guardrail to require ind
On Fri, May 12, 2023 at 1:39 PM Caleb Rackliffe
wrote:
> [POLL] Centralize existing syntax or create new syntax?
>
1 (Existing)
[POLL] Should there be a default? (YES/NO)
>
YES
> [POLL] What to do with the default?
>
1 (Default SAI)
> On May 12, 2023, at 11:36 AM, Caleb Rackliffe
> wrote:
>
> [POLL] Centralize existing syntax or create new syntax?
>
> 1.) CREATE INDEX ... USING WITH OPTIONS...
> 2.) CREATE LOCAL INDEX ... USING ... WITH OPTIONS... (same as 1, but adds
> LOCAL keyword for clarity and separation from
> [POLL] Centralize existing syntax or create new syntax?
1.) CREATE INDEX ... USING WITH OPTIONS...
> [POLL] Should there be a default? (YES/NO)
Yes
> [POLL] What to do with the default?
3.) YAML config to override default index (legacy 2i remains the default)
4.) YAML config/guardrail
1
Yes
4
On Mon, May 15, 2023 at 3:00 AM Benedict wrote:
> 3: CREATE INDEX (Otherwise 2)
> No
> If configurable, should be a distributed configuration. This is very
> different to other local configurations, as the 2i selected has semantic
> implications, not just performance (and the perf imp
3: CREATE INDEX (Otherwise 2)
No
If configurable, should be a distributed configuration. This is very different to other local configurations, as the 2i selected has semantic implications, not just performance (and the perf implications are also much greater)
On 15 May 2023, at 10:45, Mike Adamson w
>
> [POLL] Centralize existing syntax or create new syntax?
>
> 1.) CREATE INDEX ... USING WITH OPTIONS...
> 2.) CREATE LOCAL INDEX ... USING ... WITH OPTIONS... (same as 1, but
> adds LOCAL keyword for clarity and separation from future GLOBAL indexes)
>
1.) CREATE INDEX ... USING WITH
[POLL] Centralize existing syntax or create new syntax?
>
> 1.) CREATE INDEX ... USING WITH OPTIONS...
> 2.) CREATE LOCAL INDEX ... USING ... WITH OPTIONS... (same as 1, but
> adds LOCAL keyword for clarity and separation from future GLOBAL indexes)
>
(1) CREATE INDEX …
> [POLL] Should t
I don’t think there’s going to be any real support for doing it in 5.0 anyway at this point.
On May 12, 2023, at 1:48 PM, Benedict wrote:
Given we have no data in front of us to make a decision regarding switching defaults, I don’t think it is suitable to include that option in this poll. In fact,
Given we have no data in front of us to make a decision regarding switching defaults, I don’t think it is suitable to include that option in this poll. In fact, until we have sufficient data to discuss that I’m going to put a hard veto on that on technical grounds.
On 12 May 2023, at 19:41, Caleb Ra
> [POLL] Centralize existing syntax or create new syntax?
1.) CREATE INDEX ... USING WITH OPTIONS...
> [POLL] Should there be a default? (YES/NO)
YES
> [POLL] What to do with the default?
3.) YAML config to override default index (legacy 2i remains the default)
DESCRIBE should always s
...and to clarify, answers should be what you'd like to see for 5.0
specifically
On Fri, May 12, 2023 at 1:36 PM Caleb Rackliffe
wrote:
> [POLL] Centralize existing syntax or create new syntax?
>
> 1.) CREATE INDEX ... USING WITH OPTIONS...
> 2.) CREATE LOCAL INDEX ... USING ... WITH OPTION
[POLL] Centralize existing syntax or create new syntax?
1.) CREATE INDEX ... USING WITH OPTIONS...
2.) CREATE LOCAL INDEX ... USING ... WITH OPTIONS... (same as 1, but adds
LOCAL keyword for clarity and separation from future GLOBAL indexes)
(In both cases, we deprecate w/ client warnings C
But then we have to reconsider the existing syntax, or do we want LOCAL to be the default?
We should be planning our language evolution along with our feature evolution.
On 12 May 2023, at 19:28, Caleb Rackliffe wrote:
If at some point in the glorious future we have global indexes, I'm sure we can a
If at some point in the glorious future we have global indexes, I'm sure we
can add GLOBAL to the syntax...sry, working on an ugly poll...
On Fri, May 12, 2023 at 1:24 PM Benedict wrote:
> If folk should be reading up on the index type, doesn’t that conflict with
> your support of a default?
>
>
If folk should be reading up on the index type, doesn’t that conflict with your
support of a default?
Should there be different global and local defaults, once we have global
indexes, or should we always default to a local index? Or a global one?
> On 12 May 2023, at 18:39, Mick Semb Wever wro
>
>
> Given it seems most DBs have a default index (see Postgres, etc.), I tend
> to lean toward having one, but that's me...
>
I'm for it too. Would be nice to enforce the setting is globally uniform
to avoid the per-node problem. Or add a keyspace option.
For users replaying <5 DDLs this woul
There remains the question of what the new syntax is - whether it augments CREATE INDEX to replace CREATE CUSTOM INDEX or if we introduce new syntax because we think it’s clearer.
I can accept settling for modifying CREATE INDEX … USING, but I maintain that CREATE LOCAL INDEX is better
On 12 May 202
Even if we don't want to allow a default, we can keep the same CREATE INDEX
syntax in place, and have a guardrail forcing (or not) the selection of an
implementation, right? This would be no worse than the YAML option we
already have for enabling 2i creation as a whole.
On Fri, May 12, 2023 at 12:
I’m not convinced a default index makes any sense, no. The trade-offs in a distributed setting are much more pronounced.
Indexes in a local-only RDBMS are much simpler affairs; the trade-offs are much more subtle than here.
On 12 May 2023, at 18:24, Caleb Rackliffe wrote:
> Now, given this thread,
> Now, given this thread, there is pushback for a config to allow default
impl to change… but there is 0 pushback for new syntax to make this
explicit…. So maybe we should [POLL] for what syntax people want?
I think the essential question is whether we want the concept of a default
index. If we d
I still prefer introducing CREATE LOCAL INDEX, to help users understand the semantics of the index they’re creating.
I think it will in future potentially be quite confusing to be able to create global and local indexes using the same DDL statement.
But, depending on appetite, that could plausibly be
> I really dislike the idea of the same CQL doing different things based upon a
> per-node configuration.
> I agree with Brandon that changing CQL behaviour like this based on node
> config is really not ideal.
I am cool adding such a config, and also cool keeping CREATE INDEX disabled by
def
So the weakest version of the plan that actually accomplishes something
useful for 5.0:
1.) Just leave the CREATE INDEX default alone for now. Hard switch the
default after 5.0.
2.) Add USING...WITH... support to CREATE INDEX, so we don't have to go to
market using CREATE CUSTOM INDEX, which feels
I don't want to cut over for 5.0 either way. I was more contrasting a
configurable cutover in 5.0 vs. a hard cutover later.
On Fri, May 12, 2023 at 12:09 PM Benedict wrote:
> If the performance characteristics are as clear cut as you think, then
> maybe it will be an easy decision once the evide
If the performance characteristics are as clear cut as you think, then maybe it will be an easy decision once the evidence is available for everyone to consider?
If not, then we probably can’t do the hard cutover and so the answer is still pretty simple?
On 12 May 2023, at 18:04, Caleb Rackliffe wr
I don't particularly like the YAML solution either, but absent that, we're
back to fighting about whether we introduce entirely new syntax or hard cut
over to SAI at some point.
We already have per-node configuration in the YAML that determines whether
or not we can create a 2i at all, right?
Wha
A table is not a local concept at all, it has a global primary index - that’s the core idea of Cassandra.
I agree with Brandon that changing CQL behaviour like this based on node config is really not ideal. New syntax is by far the simplest and safest solution to this IMO. It doesn’t have to use the
On Fri, May 12, 2023 at 11:29 AM Caleb Rackliffe
wrote:
>
> Okay, so the proposal for 5.0...
>
> 1.) Add a YAML option that specifies a default implementation for CREATE
> INDEX, and make this the legacy 2i for now. No existing DDL breaks. We don't
> have to commit to the absolute superiority of
...and if we decide before the 5.0 release that we have enough information
to change the default (#1), we can change it in a matter of minutes.
On Fri, May 12, 2023 at 11:28 AM Caleb Rackliffe
wrote:
> We don't need to know everything about SAI's performance profile to plan
> and execute some sm
We don't need to know everything about SAI's performance profile to plan
and execute some small, reasonable things now for 5.0. I'm going to try to
summarize the least controversial package of ideas from the discussion
above. I've left out creating any new syntax. For example, I think CREATE
LOCAL
if we didn't have copious amounts of (not all public, I know, working on it) evidence
If that’s the assumption on which this proposal is based, let’s discuss the evidence base first, as given the fundamentally different way they work (almost diametrically opposite), I would want to see a very high q
> This creates huge headaches for everyone successfully using 2i today
though, and SAI *is not* guaranteed to perform as well or better - it has a
very different performance profile.
We wouldn't have even advanced it to this point if we didn't have copious
amounts of (not all public, I know, worki
This.
I would also consider adding CREATE LEGACY INDEX syntax as an alias for today’s
CREATE INDEX, the latter to be deprecated and (in very distant future) removed.
> On 12 May 2023, at 13:14, Benedict wrote:
>
> This creates huge headaches for everyone successfully using 2i today though,
>
This creates huge headaches for everyone successfully using 2i today though, and SAI *is not* guaranteed to perform as well or better - it has a very different performance profile.
I think we should deprecate CREATE INDEX, and introduce new syntax CREATE LOCAL INDEX to make clear that this is not a
On Thu, 11 May 2023 at 05:27, Patrick McFadin wrote:
> Having pulled a lot of developers out of the 2i fire,
>
Yes. I'm keen not to leave 2i as the default once SAI lands. Otherwise I
agree with the deprecate-first principle, but 2i is just too problematic.
Just having no default in 5.0, forc
There will be a LOT of content around using SAI in 5.0.
CCing marketing ML
On Wed, May 10, 2023 at 8:38 PM Jeff Jirsa wrote:
> Changes like this always scare me, but the benefits probably outweigh the
> risks. Probably obvious to whoever implements this, but please make sure that, if
> this happens, it is su
Changes like this always scare me, but the benefits probably outweigh the
risks. Probably obvious to whoever implements this, but please make sure that, if
this happens, it is super visible in both NEWS and simultaneously updates the
to-string / to-cql representation of the schema in cqlsh / drivers /
snapshots
Having pulled a lot of developers out of the 2i fire, I would love it if
defaults got a bit more sane. Adding USING...WITH... on CREATE INDEX
seems like the right move for most developers that don't read docs and
assume behavior.
As much as I hate that 2i would be the configured default, I get it.
> Having to revert to CREATE CUSTOM INDEX sounds pretty awful, so I'd prefer
> allowing USING...WITH... for CREATE INDEX
I have 0 issues with a new syntax to make this more clear
> just deprecating CREATE CUSTOM INDEX (at least after 5.0), but that's more or
> less what my original proposal was
tl;dr If you take my original proposal and change only the fact that CREATE
INDEX retains a configurable default, I think we get to the same place?
(Then it's just a matter of what we do in 5.0 vs. after 5.0...)
On Wed, May 10, 2023 at 11:00 AM Caleb Rackliffe
wrote:
> I see a broad desire here
> We could introduce new syntax that properly appreciates there’s no
default index, perhaps CREATE LOCAL [type] INDEX? To also make clear that
these indexes involve a partition key or scatter gather
I think this is something we should handle in guardrails space on the query
side for all indexes. S
I see a broad desire here to have a configurable (YAML) default
implementation for CREATE INDEX. I'm not strongly opposed to that, as the
concept of a default index implementation is pretty standard for most DBMS
(see Postgres, etc.). However, keep in mind that if we do that, we still
need to eithe
I’m not convinced by the changing defaults argument here. The characteristics of the two index types are very different, and users with scripts that make indexes today shouldn’t have their behaviour change.
We could introduce new syntax that properly appreciates there’s no default index, perhaps CRE
+1, as we must improve the image of our own default indexing ability.
As for *CREATE CUSTOM INDEX*, should we just leave it as it is and
disable the ability to create SAI through *CREATE CUSTOM INDEX* in some
version after 5.0?
As far as I know, there may be users using this as a plugin-
I agree. 5.0 is a major release and provides an opportunity to switch defaults.
> On May 9, 2023, at 7:00 PM, Jonathan Ellis wrote:
>
> +1 for this, especially in the long term. CREATE INDEX should do the right
> thing for most people without requiring extra ceremony.
>
> On Tue, May 9, 2023
+1 for this, especially in the long term. CREATE INDEX should do the right
thing for most people without requiring extra ceremony.
On Tue, May 9, 2023 at 5:20 PM Jeremiah D Jordan
wrote:
> If the consensus is that SAI is the right default index, then we should
> just change CREATE INDEX to be S
> If we assume SAI is what we should use by default for the cluster, would it
> make sense to allow
>
> CREATE INDEX [IF NOT EXISTS] [name] ON ()
>
> But use a new yaml config that switches from legacy to SAI?
>
> default_2i_impl: sai
>
> For 5.0 we can default to “legacy” (new features disab
If the consensus is that SAI is the right default index, then we should just
change CREATE INDEX to be SAI, and legacy 2i to be a CUSTOM INDEX.
> On May 9, 2023, at 4:44 PM, Caleb Rackliffe wrote:
>
> Earlier today, Mick started a thread on the future of our index creation DDL
> on Slack:
>
If we assume SAI is what we should use by default for the cluster, would it
make sense to allow
CREATE INDEX [IF NOT EXISTS] [name] ON ()
But use a new yaml config that switches from legacy to SAI?
default_2i_impl: sai
For 5.0 we can default to “legacy” (new features disabled by default), but
Earlier today, Mick started a thread on the future of our index creation
DDL on Slack:
https://the-asf.slack.com/archives/C018YGVCHMZ/p1683527794220019
At the moment, there are two ways to create a secondary index.
*1.) CREATE INDEX [IF NOT EXISTS] [name] ON ()*
This creates an optionally name