I don’t have a strong opinion about CEP-7 taking a hard dependency on any new 
CQL CEP, particularly from a point of view of first landing in the codebase.


From: Henrik Ingo <henrik.i...@datastax.com>
Date: Monday, 7 February 2022 at 12:03
To: dev@cassandra.apache.org <dev@cassandra.apache.org>
Subject: Re: [DISCUSS] CEP-7 Storage Attached Index
Thanks Benjamin for reviewing and raising this.

While I don't speak for the CEP authors, just some thoughts from me:

On Mon, Feb 7, 2022 at 11:18 AM Benjamin Lerer 
<ble...@apache.org> wrote:
I would like to raise 2 points regarding the current CEP proposal:

1. There is mention of some target versions and of the removal of SASI

At this point, we have not agreed on any version numbers and I do not feel that 
removing SASI should be part of the proposal for now.
It seems to me that we should see first the adoption surrounding SAI before 
talking about deprecating other solutions.


This seems rather uncontroversial. I think the CEP template and previous CEPs 
invite the discussion on whether the new feature will or may replace an 
existing feature. But at the same time that's of course out of scope for the 
work at hand. I have no opinion one way or the other myself.


2. OR queries

It is unclear to me if the proposal is about adding OR support only for SAI 
index or for other types of queries too.
In the past, we had the nasty habit in CQL of providing only partially 
implemented features, which resulted in a bad user experience.
Some examples are:
* LIKE restrictions which were introduced for the needs of SASI and were 
never supported for other types of queries
* IS NOT NULL restrictions for MATERIALIZED VIEWS that are not supported 
elsewhere
* != operator only supported for conditional inserts or updates
And there are unfortunately many more.

We are currently slowly trying to fix those issues and make CQL a more mature 
language. Consequently, I would like us to change our way of doing things. 
If we introduce support for OR, it should also cover all the other types of 
queries and be fully tested.
I also believe that it is a feature that due to its complexity fully deserves 
its own CEP.


The current code that would be submitted for review after the CEP is adopted 
contains OR support beyond just SAI indexes. An initial implementation first 
targeted only queries where all columns in a WHERE clause using OR needed 
to be backed by an SAI index. This has since been extended to support ALLOW 
FILTERING mode as well as OR with clustering key columns. The current 
implementation is by no means perfect as general purpose OR support; the 
focus all along was on implementing OR support in SAI. I'll leave it to 
others to enumerate exactly the limitations of the current implementation.

Seeing that Benedict also supports your point of view, I would steer the 
conversation more toward a project management perspective:
* How can we advance CEP-7 so that the bulk of the SAI code can still be added 
to Cassandra, so that users can benefit from this new index type, albeit 
without OR?
* This is also an important question from the point of view that this is a 
large block of code that will inevitably diverge if it's not in trunk. Also, 
merging it to trunk will allow future enhancements, including the OR syntax 
btw, to happen against trunk (aka upstream first).
* Since OR support is nevertheless a feature of SAI, it needs to be at least 
unit tested, and ideally would be exposed so that it is possible to test 
at the CQL level. Is there some mechanism, such as experimental flags, which 
would allow the SAI-only OR support to be merged into trunk, while a separate 
CEP is focused on implementing "proper" general purpose OR support? I should 
note that there is no guarantee that the OR CEP would be implemented in time 
for the next release. So the answer to this point needs to be something that 
doesn't violate the desire for good user experience.

henrik

