> We don't need a whole "codec framework" for V1, but we're still embedding 
> some versioning information in the column index on-disk structures, right?

I’m not sure why we would want to pull the versioning code only to have to put 
it back in as soon as we need to change the on-disk format. We also need to 
consider whether the legacy format used by DSE is supported in OSS. I’m not 
sure of the policy on this although I strongly suspect that the answer is that 
it won’t be supported. Either way, it would seem to be a lot of work to pull 
the versioning code out at this point since it formed part of a major refactor 
of the SAI framework and plumbing.

MikeA

> On 11 Feb 2022, at 18:47, Caleb Rackliffe <calebrackli...@gmail.com> wrote:
> 
> Just finished reading the latest version of the CEP. Here are my thoughts:
> 
> - We've already talked about OR queries, so I won't rehash that, but 
> tokenization support seems like it might be another one of those places where 
> we can cut scope if we want to get V1 out the door. It shouldn't be that hard 
> to detangle from the rest of the code.
> - We mention the JMX metric ecosystem in the CEP, but not the related virtual 
> tables. This isn't a big issue, and doesn't mean we need to change the CEP, 
> but it might be helpful for those not familiar with the existing prototype to 
> know they exist :)
> - It's probably below the line for CEP discussion, but the text and numeric 
> index formats will probably change over time. We don't need a whole "codec 
> framework" for V1, but we're still embedding some versioning information in 
> the column index on-disk structures, right?
> 
> To offset my obvious partiality around this CEP, I've already made an effort 
> to raise some of the issues that may come up to challenge us from a macro 
> perspective. It seems like the prevailing opinion here is that they are 
> either surmountable or simply basic conceptual difficulties w/ distributed 
> secondary indexing.
> 
> tl;dr I'm +1 on bringing this to a vote and starting to put together all the 
> pieces for CASSANDRA-16052 
> <https://issues.apache.org/jira/browse/CASSANDRA-16052> :)
> 
> On Thu, Feb 10, 2022 at 11:26 AM Mike Adamson <madam...@datastax.com 
> <mailto:madam...@datastax.com>> wrote:
> > I'd be interested to hear from Mike/Jason on the OR support topic, of 
> > course.
> 
> The support for OR within SAI is fairly minimal and will not work without the 
> non-SAI changes needed. Since the non-SAI OR changes are extensive it would 
> be better to bring those in under their own CEP. 
> 
> I’d leave the decision of whether to put the rest of SAI behind an 
> experimental flag to others. My preference would be to not do so because the 
> non-OR implementation has been tested and used in production for over a year 
> now.
> 
> MikeA
> 
>> On 9 Feb 2022, at 13:06, bened...@apache.org <mailto:bened...@apache.org> 
>> wrote:
>> 
>> > Is there some mechanism such as experimental flags, which would allow the 
>> > SAI-only OR support to be merged into trunk
>>  
>> FWIW, I’m OK with this merging to trunk, either hidden behind a CI-only flag 
>> or exposed to the user via some experimental flag (and a suitable NEWS.txt). 
>> We’ve discussed the need to periodically merge feature branches with trunk 
>> before they are complete. If the work is logically complete for SAI, and 
>> we’re only pending work to make OR consistent between SAI and non-SAI 
>> queries, I think that more than meets this criterion.
>>  
>>  
>> From: Henrik Ingo <henrik.i...@datastax.com 
>> <mailto:henrik.i...@datastax.com>>
>> Date: Monday, 7 February 2022 at 12:03
>> To: dev@cassandra.apache.org <mailto:dev@cassandra.apache.org> 
>> <dev@cassandra.apache.org <mailto:dev@cassandra.apache.org>>
>> Subject: Re: [DISCUSS] CEP-7 Storage Attached Index
>> 
>> Thanks Benjamin for reviewing and raising this.
>>  
>> While I don't speak for the CEP authors, just some thoughts from me:
>>  
>> On Mon, Feb 7, 2022 at 11:18 AM Benjamin Lerer <ble...@apache.org 
>> <mailto:ble...@apache.org>> wrote:
>> I would like to raise 2 points regarding the current CEP proposal:
>>  
>> 1. There is mention of some target versions and of the removal of SASI 
>>  
>> At this point, we have not agreed on any version numbers and I do not feel 
>> that removing SASI should be part of the proposal for now.
>> It seems to me that we should see first the adoption surrounding SAI before 
>> talking about deprecating other solutions.
>>  
>>  
>> This seems rather uncontroversial. I think the CEP template and previous 
>> CEPs invite the discussion on whether the new feature will or may replace 
>> an existing feature. But at the same time that's of course out of scope for 
>> the work at hand. I have no opinion one way or the other myself.
>>  
>>  
>> 2. OR queries
>>  
>> It is unclear to me if the proposal is about adding OR support only for SAI 
>> index or for other types of queries too.
>> In the past, we had the nasty habit in CQL of providing only partially 
>> implemented features, which resulted in a bad user experience.
>> Some examples are:
>> * LIKE restrictions, which were introduced for the needs of SASI and were 
>> never supported for other types of queries
>> * IS NOT NULL restrictions for MATERIALIZED VIEWS that are not supported 
>> elsewhere
>> * != operator only supported for conditional inserts or updates
>> And there are unfortunately many more.
>>  
>> We are currently slowly trying to fix those issues and make CQL a more mature 
>> language. Consequently, I would like us to change our way of doing 
>> things. If we introduce support for OR, it should also cover all the other 
>> types of queries and be fully tested.
>> I also believe that it is a feature that due to its complexity fully 
>> deserves its own CEP.
>>  
>>  
>> The current code that would be submitted for review after the CEP is 
>> adopted, contains OR support beyond just SAI indexes. An initial 
>> implementation first targeted only such queries where all columns in a WHERE 
>> clause using OR needed to be backed by an SAI index. This was since extended 
>> to also support ALLOW FILTERING mode as well as OR with clustering key 
>> columns. The current implementation is by no means perfect as general-purpose 
>> OR support; the focus all along was on implementing OR support in 
>> SAI. I'll leave it to others to enumerate exactly the limitations of the 
>> current implementation.
>>  
>> Seeing that Benedict also supports your point of view, I would steer the 
>> conversation more into a project management perspective:
>> * How can we advance CEP-7 so that the bulk of the SAI code can still be 
>> added to Cassandra, so that users can benefit from this new index type, 
>> albeit without OR?
>> * This is also an important question from the point of view that this is a 
>> large block of code that will inevitably diverge if it's not in trunk. 
>> Also, merging it to trunk will allow future enhancements, including the OR 
>> syntax btw, to happen against trunk (aka upstream first).
>> * Since OR support nevertheless is a feature of SAI, it needs to be at least 
>> unit tested, but ideally would even be exposed so that it is possible to 
>> test on the CQL level. Is there some mechanism such as experimental flags, 
>> which would allow the SAI-only OR support to be merged into trunk, while a 
>> separate CEP is focused on implementing "proper" general purpose OR support? 
>> I should note that there is no guarantee that the OR CEP would be 
>> implemented in time for the next release. So the answer to this point needs 
>> to be something that doesn't violate the desire for good user experience.
>>  
>> henrik
>>  
>>  
> 
