Re: Proposal to retroactively mark materialized views experimental

2017-10-04 Thread Sylvain Lebresne
For the record, in case I was unclear, it was never my intention to
suggest that we shouldn't warn about MVs: I would agree that we still
should and I'm happy that we do. I would also agree that the remaining
caveats and limitations should be more clearly documented.

But I kind of got the feeling that people were trying to justify
taking what I consider somewhat drastic measures (disabling MVs by
default _in a patch release_) by piling on about how bad MVs were and how
impossible it was for anyone to ever use them without dying a
horrible death. This, to me, felt a bit unfair to the hard work that
has gone into fixing the more blatant problems.

Tl;dr: MVs are certainly not perfect (spoiler alert: they probably
never will be), but they are now, imo, in a state where some users can
use them productively. So it's OK to warn about their remaining
problems/limitations, but not OK, for me, to risk breaking existing users
in a patch release.


On Wed, Oct 4, 2017 at 3:31 AM, Benedict Elliott Smith
<_...@belliottsmith.com> wrote:
> So, I'm of the opinion there's a difference between users misusing a well 
> understood feature whose shortcomings are widely discussed in the community, 
> and providing a feature we don't fully understand, have not fully documented 
> the caveats of, let alone discovered all the problems with nor had that 
> knowledge percolate fully into the wider community.
>
> I also think there's a huge difference between users shooting themselves in 
> the foot, and us shooting them in the foot.
>
> There's a degree of trust - undeserved - that goes with being a database.
> People assume you're smarter than them, and that it Just Works.  Given this,
> and given that squandering this trust is a bad thing, I personally believe it is
> better to offer the feature as experimental until we iron out all of the
> problems, fully understand it, and have a wider community knowledge base
> around it.
>
> We can still encourage users that can tolerate problems to use it, but we 
> won't be giving any false assurances to those that don't.  Doesn't that seem 
> like a win-win?
>
>
>
>> On 3 Oct 2017, at 21:07, Jeremiah D Jordan  wrote:
>>
>> So, for some perspective here, how do users who do not get the guarantees of
>> MVs implement this on their own?  They use logged batches.
>>
>> Pseudo CQL here, but you should get the picture:
>>
>> If they don’t ever update data, they do it like so, and it is pretty safe:
>> BEGIN BATCH
>>   INSERT INTO tablea (...) VALUES (blah);
>>   INSERT INTO tableb (...) VALUES (blahview);
>> APPLY BATCH;
>>
>> If they do update data, they likely do it like so, and get it wrong in the
>> face of concurrency:
>> SELECT * FROM tablea WHERE key = blah;
>>
>> BEGIN BATCH
>>   INSERT INTO tablea (...) VALUES (blah);
>>   INSERT INTO tableb (...) VALUES (blahview);
>>   DELETE FROM tableb WHERE key = oldblahview;
>> APPLY BATCH;
>>
>> A sophisticated user that understands the concurrency issues may well try to 
>> implement it like so:
>>
>> SELECT key, col1, col2 FROM tablea WHERE key=blah;
>>
>> BEGIN BATCH
>>   UPDATE tablea SET col1=new1, col2=new2 WHERE key=blah IF col1=old1 AND col2=old2;
>>   UPDATE tableb SET viewc1=new2, viewc2=blah WHERE key=new1;
>>   DELETE FROM tableb WHERE key=old1;
>> APPLY BATCH;
>>
>> And it wouldn’t work, because you can only use LWT in a BATCH if all updates
>> have the same partition key value, and the whole point of a view, most of the
>> time, is that it doesn't. (There are other issues with this too, like most
>> likely needing to use UUIDs or something else to distinguish between
>> concurrent updates, that are not realized until it is too late.)
>>
>> A user who does not dig in and understand how MVs work most likely also
>> does not dig in to understand the trade-offs and drawbacks of logged
>> batches to multiple tables across different partition keys.  Or even
>> necessarily of read-before-writes, concurrent updates, and the races
>> inherent in them.  I would guess that using MVs, even as they are today, is
>> *safer* for these users than rolling their own.  I have seen these patterns
>> implemented by people many times, including the “broken in the face of
>> concurrency” version.  So let's please not try to argue that a casual user
>> that does not dig in to the specifics of feature A is going to dig in and
>> understand the specifics of any other feature.  So yes, I would prefer my
>> bank to use MVs as they are today over rolling their own and getting it
>> even more wrong.
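>>
>> For comparison, a minimal sketch of the MV equivalent, reusing the made-up
>> schema from the pseudo CQL above (tablea keyed by key, view keyed by col1),
>> would be a single definition that the server then maintains on every write
>> to tablea:
>>
>> CREATE MATERIALIZED VIEW tableb AS
>>   SELECT key, col1, col2 FROM tablea
>>   WHERE key IS NOT NULL AND col1 IS NOT NULL
>>   PRIMARY KEY (col1, key);
>>
>> No client-side batch, read-before-write, or manual cleanup of the old view
>> row is needed; that is the trade the user is making.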
>>
>> Now, even given all that, if we want to warn users of the pitfalls of using
>> MVs, then let's do that.  But let's keep some perspective on how things
>> actually get used.
>>
>> -Jeremiah
>>
>>> On Oct 3, 2017, at 8:12 PM, Benedict Elliott Smith <_...@belliottsmith.com> 
>>> wrote:
>>>
>>> While many users may apparently be using MVs successfully, the problem is 
>>> how few (if any) know what guarantees they are getting.  Since we aren’t 
>>> even absolutely certain ourselves, it cannot be many.  Most of the 
>>> shortcomings we are aware of are complicated, concern failure scenarios and 
>>> 

Re: Proposal to retroactively mark materialized views experimental

2017-10-04 Thread Mick Semb Wever
> > CDC sounds like it is in the same basket, but it already has the
> > `cdc_enabled` yaml flag which defaults false.
>
> I went this route because I was incredibly wary of changing the CL
> code and wanted to shield non-CDC users from any and all risk I
> reasonably could.


This approach so far is my favourite. (Thanks Josh.)

The flag name `cdc_enabled` is simple and, without adjectives, does not
imply "experimental" or "beta" or anything like that.
It does make life easier for both operators and the C* developers.
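
For illustration, the existing CDC flag in cassandra.yaml looks like the first
line below; the second line is only a sketch of what an analogous,
adjective-free flag for MVs might look like (no such flag exists today, and
its name and default would be up for discussion):

  # cassandra.yaml
  cdc_enabled: false
  materialized_views_enabled: true    # hypothetical MV analogue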

I'm also fond of how Apache projects often vote both on the release as well
as its stability flag: Alpha|Beta|GA (General Availability).
https://httpd.apache.org/dev/release.html
http://www.apache.org/legal/release-policy.html#release-types

Given the importance of The Database, I'd be keen to see such
community-agreed quality references attached. And going further, not just to the
releases but also to substantial new features (those yet to reach GA). Then
the downloads page could provide a table something like
https://paste.apache.org/FzrQ

It's just one idea to throw out there, and while it hijacks the thread a
bit, it could, even with just the quality tag on releases, go a long way toward
user trust. Especially if we really are humble about it and use GA
appropriately. For example, I'm perfectly happy using a beta in production
if I see the community otherwise has good processes in place and there are
strong testing and staging resources to take advantage of. And as Kurt has
implied, many users are indeed smart and wise enough to know how to safely
test and cautiously use even alpha features in production.

Anyway, with or without the above idea, yaml flag names that don't
use adjectives could address Kurt's concerns about pulling the rug from
under the feet of existing users. Such a flag is but a small improvement
suitable for a minor release (you must read the NEWS.txt before even a
patch upgrade), and the documentation is only making explicit what should
have been explicit all along. Users shouldn't feel that we're returning features
to "alpha|beta" mode when what we're actually doing is improving the
community's quality assurance documentation.

Mick


Re: Proposal to retroactively mark materialized views experimental

2017-10-04 Thread kurt greaves
>
> The flag name `cdc_enabled` is simple and, without adjectives, does not
> imply "experimental" or "beta" or anything like that.
> It does make life easier for both operators and the C* developers.

I would be all for an mv_enabled option, assuming it's enabled by default
for all existing branches. I don't think saying that you are meant to read
NEWS.txt before upgrading a patch is acceptable. Most people don't, and
expecting them to is a bit insane. Assuming that if they read it
they'd understand all the implications is also a bit questionable. If it's deemed
suitable to turn it off, that can be done in the next major/minor, but I
think that would be unlikely, as we should really require sufficient
evidence that it's dangerous, which I just don't think we have. I'm still of
the opinion that MVs in their current state are no worse off than a lot of
other features, and marking them as experimental and disabling them now would
just be detrimental to their development and annoy users. Also, if we give
them that treatment, then there are a whole load of other defaults we should
change and disable, which is just not acceptable in a patch release. It's
not really necessary anyway; we don't have anyone crying bloody murder on
the mailing list about how everything went to hell because they used
feature X.

No one has really provided any counter-evidence yet that MVs are in some
awful state and are going to shoot users. There are a few existing
issues that I've brought up already, but they are really quite minor,
nothing comparable to "lol you can't repair if you use vnodes, sorry". I
think we really need some real examples/evidence before making calls like
"let's disable this feature in a patch release and mark it experimental".

>  I personally believe it is better to offer the feature as experimental
> until we iron out all of the problems

What problems are you referring to, and how exactly will we know when all
of them have been sufficiently ironed out? If we mark it as experimental, how
exactly are we going to get people to use said feature to find issues?


Re: Proposal to retroactively mark materialized views experimental

2017-10-04 Thread Josh McKenzie
> and providing a feature we don't fully understand, have not fully
documented the caveats of, let alone discovered all the problems with nor
had that knowledge percolate fully into the wider community.
There appear to be varying levels of understanding of the implementation
details of MVs (which seem to directly correlate with faith in the
feature's correctness for the recommended use-cases) on this email thread,
so while I respect a sense of general wariness about the state of
correctness testing with C*, I don't agree that the thoroughness of testing
of MVs is any different from that of any other feature we've added to the
code-base since the project's inception.

That's not to say I think the current extent of our testing before GA on
features is adequate; I don't, but I don't think it makes sense to draw an
arbitrary line in the sand with already released features that are in use
in production clusters, flagging said features as experimental after the
fact, and thus eroding users' trust in our collective definition of done.
What's to stop us from flagging other, seemingly arbitrary features people
are relying on in production as experimental in the future? What does that
mean for their faith in the project and their job security? SASI? LWT?
Counters? Triggers? Repair and compaction due to (still arising) edge-cases
and defects in early re-open and incremental repair? All of these features
still have edge-cases due to the inherent complexity of the code-base and
problem domain in which we work.

Right now there appear to be the two camps of 'I can't clearly articulate
what Good Enough is since it's Complicated, but I know we're not there' and
'if people are relying on it in production without issue it's by definition
good enough for their use-case'. It's a compromise; nothing is ever perfect
(as we all know). I'm all for us saying 'We need better testing of features
going forward', 'We need better metrics for the coverage and branch testing
of things in C*', etc, and definitely in favor of us spending some time to
increase our coverage for existing features.

I don't think MVs are any different from anything else in this code-base
in terms of how well vetted the features are, for better or for worse.

On Wed, Oct 4, 2017 at 5:21 AM, kurt greaves  wrote:

> >
> > The flag name `cdc_enabled` is simple and, without adjectives, does not
> > imply "experimental" or "beta" or anything like that.
> > It does make life easier for both operators and the C* developers.
>
> I would be all for a mv_enabled option, assuming it's enabled by default
> for all existing branches. I don't think saying that you are meant to read
> NEWS.txt before upgrading a patch is acceptable. Most people don't, and
> expecting them to is a bit insane. Also Assuming that if they read it
> they'd understand all implications is also a bit questionable. If deemed
> suitable to turn it off that can be done in the next major/minor, but I
> think that would be unlikely, as we should really require sufficient
> evidence that it's dangerous which I just don't think we have. I'm still of
> the opinion that MV in their current state are no worse off than a lot of
> other features, and marking them as experimental and disabling now would
> just be detrimental to their development and annoy users. Also if we give
> them that treatment then there a whole load of other defaults we should
> change and disable which is just not acceptable in a patch release. It's
> not really necessary anyway, we don't have anyone crying bloody murder on
> the mailing list about how everything went to hell because they used
> feature x.
>
> No one has really provided any counter evidence yet that MV's are in some
> awful state and they are going to shoot users. There are a few existing
> issues that I've brought up already, but they are really quite minor,
> nothing comparable to "lol you can't repair if you use vnodes, sorry". I
> think we really need some real examples/evidence before making calls like
> "lets disable this feature in a patch release and mark it experimental"
>
> >  I personally believe it is better to offer the feature as experimental
> > until we iron out all of the problems
>
> What problems are you referring to, and how exactly will we know when all
> of them have been sufficiently ironed? If we mark it as experimental how
> exactly are we going to get people to use said feature to find issues?
> ​
>


Re: Proposal to retroactively mark materialized views experimental

2017-10-04 Thread Stefan Podkowinski
Introducing feature flags for enabling or disabling different code paths
is not sustainable in the long run. It's hard enough to keep up with
integration testing with the couple of Jenkins jobs that we have.
Running jobs for all permutations of the flags that we keep around would
turn out to be impractical. But if we don't, I'm pretty sure something will
fall off the radar, and it won't take long until someone reports that
enabling feature X after the latest upgrade simply doesn't work anymore.

There may also be some more subtle assumptions and cross-dependencies
between features that could cause side effects when disabling a feature (or
parts of it), even if it's just, e.g., a metric value that suddenly won't
get updated anymore but is used somewhere else. We'll also have to
consider migration paths for turning a feature on and off again without
causing any downtime. If I were to turn on, e.g., MVs on a single node in
my cluster, then this should not cause any issues on the other nodes
that still have MV code paths disabled. Again, this would need to be tested.

So to be clear, my point is that any flags should be implemented in a
really non-invasive way on the user-facing side only, e.g. by emitting a
log message or a cqlsh error. At this point, I'm not really sure it
would be a good idea to add them to cassandra.yaml, as I'm pretty sure
that eventually they will be used to change the behaviour of our code
beyond printing a log message.


On 04.10.17 10:03, Mick Semb Wever wrote:
>>> CDC sounds like it is in the same basket, but it already has the
>>> `cdc_enabled` yaml flag which defaults false.
>> I went this route because I was incredibly wary of changing the CL
>> code and wanted to shield non-CDC users from any and all risk I
>> reasonably could.
>
> This approach so far is my favourite. (Thanks Josh.)
>
> The flag name `cdc_enabled` is simple and, without adjectives, does not
> imply "experimental" or "beta" or anything like that.
> It does make life easier for both operators and the C* developers.
>
> I'm also fond of how Apache projects often vote both on the release as well
> as its stability flag: Alpha|Beta|GA (General Availability).
> https://httpd.apache.org/dev/release.html
> http://www.apache.org/legal/release-policy.html#release-types
>
> Given the importance of The Database, i'd be keen to see attached such
> community-agreed quality references. And going further, not just to the
> releases but also to substantial new features (those yet to reach GA). Then
> the downloads page could provide a table something like
> https://paste.apache.org/FzrQ
>
> It's just one idea to throw out there, and while it hijacks the thread a
> bit, it could even with just the quality tag on releases go a long way with
> user trust. Especially if we really are humble about it and use GA
> appropriately. For example I'm perfectly happy using a beta in production
> if I see the community otherwise has good processes in place and there's
> strong testing and staging resources to take advantage of. And as Kurt has
> implied many users are indeed smart and wise enough to know how to safely
> test and cautiously use even alpha features in production.
>
> Anyway, with or without the above idea, yaml flag names that don't
> use adjectives could address Kurt's concerns about pulling the rug from
> under the feet of existing users. Such a flag is but a small improvement
> suitable for a minor release (you must read the NEWS.txt before even a
> patch upgrade), and the documentation is only making explicit what should
> have been all along. Users shouldn't feel that we're returning features
> into "alpha|beta" mode when what we're actually doing is improving the
> community's quality assurance documentation.
>
> Mick
>





Re: Proposal to retroactively mark materialized views experimental

2017-10-04 Thread Benedict Elliott Smith
So, as the author of one of the disasters you mention (early re-open), I would 
prefer to learn from the mistake and not repeat it.  Unfortunately we seem to 
be in the habit of repeating it, and that feature was a lot *lot* simpler.

Let’s not kid ourselves: MVs are far and away the most complicated feature
we have ever delivered.  We do not fully understand them, even in theory, let
alone can we be sure we have the implementation right.

So, if we all agree our testing is ordinarily insufficient, can’t we agree it 
is probably *really* insufficient here?

I don’t want to give the impression I’m shifting the goals.  I’ve been against 
MV inclusion as they stand for some time, as were several others.  I think in 
the new world order of project/community structure, they probably would have 
been rejected as they stand.

I’ve consistently listed my own requirements for considering them production 
ready:  extensive modelling and simulation of the algorithm’s properties (in 
lieu of formal proofs), *safe* default behaviour (rollback CASSANDRA-10230, or 
make it a per-table option, and default to fast only for existing tables to 
avoid surprise), tools for detecting and repairing inconsistencies, and more 
extensive testing.

Many of these things were agreed as prerequisites for release of 3.0, but 
ultimately they were not delivered.

I do, however, absolutely agree with Sylvain that we need to minimise surprise 
in a patch version.


On 4 Oct 2017, at 08:58, Josh McKenzie  wrote:

>> and providing a feature we don't fully understand, have not fully
> documented the caveats of, let alone discovered all the problems with nor
> had that knowledge percolate fully into the wider community.
> There appear to be varying levels of understanding of the implementation
> details of MV's (that seem to directly correlate with faith in the
> feature's correctness for the use-cases recommended) on this email thread
> so while I respect a sense of general wariness about the state of
> correctness testing with C*, I don't agree that the thoroughness of testing
> of MV's is any different than any other feature we've added to the
> code-base since the project's inception.
> 
> That's not to say I think the current extent of our testing before GA on
> features is adequate; I don't, but I don't think it makes sense to draw an
> arbitrary line in the sand with already released features that are in use
> in production clusters, flagging said features as experimental after the
> fact, and thus eroding users' trust in our collective definition of done.
> What's to stop us from flagging other, seemingly arbitrary features people
> are relying on in production as experimental in the future? What does that
> mean for their faith in the project and their job security? SASI? LWT?
> Counters? Triggers? Repair and compaction due to (still arising) edge-cases
> and defects in early re-open and incremental repair? All of these features
> still have edge-cases due to the inherent complexity of the code-base and
> problem domain in which we work.
> 
> Right now there appear to be the two camps of 'I can't clearly articulate
> what Good Enough is since it's Complicated, but I know we're not there' and
> 'if people are relying on it in production without issue it's by definition
> good enough for their use-case'. It's a compromise; nothing is ever perfect
> (as we all know). I'm all for us saying 'We need better testing of features
> going forward', 'We need better metrics for the coverage and branch testing
> of things in C*', etc, and definitely in favor of us spending some time to
> increase our coverage for existing features.
> 
> I don't think MV's are any different than anything else in this code-base
> in terms of how well vetted the features are, for better or for worse.
> 
> On Wed, Oct 4, 2017 at 5:21 AM, kurt greaves  wrote:
> 
>>> 
>>> The flag name `cdc_enabled` is simple and, without adjectives, does not
>>> imply "experimental" or "beta" or anything like that.
>>> It does make life easier for both operators and the C* developers.
>> 
>> I would be all for a mv_enabled option, assuming it's enabled by default
>> for all existing branches. I don't think saying that you are meant to read
>> NEWS.txt before upgrading a patch is acceptable. Most people don't, and
>> expecting them to is a bit insane. Also Assuming that if they read it
>> they'd understand all implications is also a bit questionable. If deemed
>> suitable to turn it off that can be done in the next major/minor, but I
>> think that would be unlikely, as we should really require sufficient
>> evidence that it's dangerous which I just don't think we have. I'm still of
>> the opinion that MV in their current state are no worse off than a lot of
>> other features, and marking them as experimental and disabling now would
>> just be detrimental to their development and annoy users. Also if we give
>> them that treatment then there a whole load of other defaults we should
>>

Re: [VOTE PASSED] Release Apache Cassandra 2.1.19

2017-10-04 Thread Michael Shuler
With 10 binding +1, 1 non-binding +1, and no other votes, this vote
passes. I'll get the artifacts published today.

-- 
Kind regards,
Michael

On 09/28/2017 01:11 PM, Michael Shuler wrote:
> I propose the following artifacts for release as 2.1.19.
> 
> sha1: 428eaa3e37cab7227c81fdf124d29dfc1db4257c
> Git:
> http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/2.1.19-tentative
> Artifacts:
> https://repository.apache.org/content/repositories/orgapachecassandra-1148/org/apache/cassandra/apache-cassandra/2.1.19/
> Staging repository:
> https://repository.apache.org/content/repositories/orgapachecassandra-1148/
> 
> The Debian and RPM packages are available here:
> http://people.apache.org/~mshuler
> 
> The vote will be open for 72 hours (longer if needed).
> 
> [1]: (CHANGES.txt) https://goo.gl/1sZLdP (also RPM file ownership fix)
> [2]: (NEWS.txt) https://goo.gl/YKEuRc
> 





Re: [VOTE PASSED] Release Apache Cassandra 2.2.11

2017-10-04 Thread Michael Shuler
With 10 binding +1, 3 non-binding +1, and no other votes, this vote
passes. I'll get the artifacts published today.

-- 
Kind regards,
Michael

On 09/28/2017 01:40 PM, Michael Shuler wrote:
> I propose the following artifacts for release as 2.2.11.
> 
> sha1: c510e001481637e1f74d9ad176f8dc3ab7ebd1e3
> Git:
> http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/2.2.11-tentative
> Artifacts:
> https://repository.apache.org/content/repositories/orgapachecassandra-1149/org/apache/cassandra/apache-cassandra/2.2.11/
> Staging repository:
> https://repository.apache.org/content/repositories/orgapachecassandra-1149/
> 
> The Debian and RPM packages are available here:
> http://people.apache.org/~mshuler
> 
> The vote will be open for 72 hours (longer if needed).
> 
> [1]: (CHANGES.txt) https://goo.gl/qTG7xU (also RPM file ownership fix)
> [2]: (NEWS.txt) https://goo.gl/ggdkLH
> 





Re: Proposal to retroactively mark materialized views experimental

2017-10-04 Thread Josh McKenzie
I don't agree at face value that early re-open is, in sum, a lot simpler than
MVs, or that adding CQL and deprecating Thrift was a lot simpler, or the
8099 refactor, etc. Different types of complexity, certainly, and MVs are
arguably harder to prove correct due to the surface area of exposure to failure
states. Definitions of complexity aside, I do agree with the general
principle that MVs are very complex and, as with many other things in the
DB, boundary conditions are insufficiently understood and tested at this
time. There's also a recency bias to the defects and active work people are
seeing with MVs, as there has been a recent focus on stabilizing them, in contrast
to the long tail we've seen with other, more pervasive and
foundational changes to the code-base over the course of the past few years.

MVs aren't the only thing in the DB that I think qualifies for 'flagging as
not-production-ready' by the criteria people are attempting to selectively
apply to the feature here. If we go the route of flagging one already
released feature as experimental because we lack confidence in it, there are
other things we similarly lack confidence in that should be treated
similarly (incremental repair and SASI, to name two that immediately come to
mind). I personally don't think changing the qualification and user
experience of features post-release sends a good message to said users; if
we all agreed unanimously that these features were this failure-prone and
high-risk, it would be more appropriate to make that change; however, that's
obviously not the case here.


On Wed, Oct 4, 2017 at 10:41 AM, Benedict Elliott Smith <_...@belliottsmith.com
> wrote:

> So, as the author of one of the disasters you mention (early re-open), I
> would prefer to learn from the mistake and not repeat it.  Unfortunately we
> seem to be in the habit of repeating it, and that feature was a lot *lot*
> simpler.
>
> Let’s not kid ourselves: MVs are by far and away the most complicated
> feature we have ever delivered.  We do not fully understand it, even in
> theory, let alone can we be sure we have the implementation right.
>
> So, if we all agree our testing is ordinarily insufficient, can’t we agree
> it is probably *really* insufficient here?
>
> I don’t want to give the impression I’m shifting the goals.  I’ve been
> against MV inclusion as they stand for some time, as were several others.
> I think in the new world order of project/community structure, they
> probably would have been rejected as they stand.
>
> I’ve consistently listed my own requirements for considering them
> production ready:  extensive modelling and simulation of the algorithm’s
> properties (in lieu of formal proofs), *safe* default behaviour (rollback
> CASSANDRA-10230, or make it a per-table option, and default to fast only
> for existing tables to avoid surprise), tools for detecting and repairing
> inconsistencies, and more extensive testing.
>
> Many of these things were agreed as prerequisites for release of 3.0, but
> ultimately they were not delivered.
>
> I do, however, absolutely agree with Sylvain that we need to minimise
> surprise in a patch version.
>
>
> On 4 Oct 2017, at 08:58, Josh McKenzie  wrote:
>
> >> and providing a feature we don't fully understand, have not fully
> > documented the caveats of, let alone discovered all the problems with nor
> > had that knowledge percolate fully into the wider community.
> > There appear to be varying levels of understanding of the implementation
> > details of MV's (that seem to directly correlate with faith in the
> > feature's correctness for the use-cases recommended) on this email thread
> > so while I respect a sense of general wariness about the state of
> > correctness testing with C*, I don't agree that the thoroughness of
> testing
> > of MV's is any different than any other feature we've added to the
> > code-base since the project's inception.
> >
> > That's not to say I think the current extent of our testing before GA on
> > features is adequate; I don't, but I don't think it makes sense to draw
> an
> > arbitrary line in the sand with already released features that are in use
> > in production clusters, flagging said features as experimental after the
> > fact, and thus eroding users' trust in our collective definition of done.
> > What's to stop us from flagging other, seemingly arbitrary features
> people
> > are relying on in production as experimental in the future? What does
> that
> > mean for their faith in the project and their job security? SASI? LWT?
> > Counters? Triggers? Repair and compaction due to (still arising)
> edge-cases
> > and defects in early re-open and incremental repair? All of these
> features
> > still have edge-cases due to the inherent complexity of the code-base and
> > problem domain in which we work.
> >
> > Right now there appear to be the two camps of 'I can't clearly articulate
> > what Good Enough is since it's Complicated, but I know we're not there'
> and
> > 'if pe

Re: Proposal to retroactively mark materialized views experimental

2017-10-04 Thread Aleksey Yeshchenko
We already have such flags for UDFs and CDC.

We should have more: for triggers, SASI, and MVs, at least. Operators need a 
way to disable features they haven’t validated.

We already have sufficient consensus to introduce the flags, and we should. 
There also seems to be sufficient consensus on emitting warnings.

The debate is now on their defaults for MVs in 3.0, 3.11, and 4.0. I agree with 
Sylvain that flipping the default in a minor would be invasive. We shouldn’t do 
that.

For trunk, though, I think we should default to off. When it comes to releasing 
4.0 we can collectively decide if there is sufficient trust in MVs at the time 
to warrant flipping the default to true. Ultimately we can decide this in a PMC 
vote. If I misread the consensus regarding the default for 4.0, then we might 
as well vote on that. What I see is sufficient distrust coming from core 
committers, including the author of the v1 design, to warrant opt-in for MVs.

If we don’t trust in them as developers, we shouldn’t be cavalier with the 
users, either. Not until that trust is gained/regained.

—
AY

On 4 October 2017 at 13:26:10, Stefan Podkowinski (s...@apache.org) wrote:

Introducing feature flags for enabling or disabling different code paths  
is not sustainable in the long run. It's hard enough to keep up with  
integration testing with the couple of Jenkins jobs that we have.  
Running jobs for all permutations of flags that we keep around, would  
turn out impractical. But if we don't, I'm pretty sure something will  
fall off the radar and it won't take long until someone reports that  
enabling feature X after the latest upgrade will simply not work anymore.  

There may also be some more subtle assumptions and cross dependencies  
between features that may cause side effects by disabling a feature (or  
parts of it), even if it's just e.g. a metric value that suddenly won't  
get updated anymore, but is used somewhere else. We'll also have to  
consider migration paths for turning a feature on and off again without  
causing any downtime. If I was to turn on e.g. MVs on a single node in  
my cluster, then this should not cause any issues on the other nodes  
that still have MV code paths disabled. Again, this would need to be tested.  

So to be clear, my point is that any flags should be implemented in a  
really non-invasive way on the user facing side only, e.g. by emitting a  
log message or cqlsh error. At this point, I'm not really sure if it  
would be a good idea to add them to cassandra.yaml, as I'm pretty sure  
that eventually they will be used to change the behaviour of our code,  
beside printing a log message.  


On 04.10.17 10:03, Mick Semb Wever wrote:  
>>> CDC sounds like it is in the same basket, but it already has the  
>>> `cdc_enabled` yaml flag which defaults false.  
>> I went this route because I was incredibly wary of changing the CL  
>> code and wanted to shield non-CDC users from any and all risk I  
>> reasonably could.  
>  
> This approach so far is my favourite. (Thanks Josh.)  
>  
> The flag name `cdc_enabled` is simple and, without adjectives, does not  
> imply "experimental" or "beta" or anything like that.  
> It does make life easier for both operators and the C* developers.  
>  
> I'm also fond of how Apache projects often vote both on the release as well  
> as its stability flag: Alpha|Beta|GA (General Availability).  
> https://httpd.apache.org/dev/release.html  
> http://www.apache.org/legal/release-policy.html#release-types  
>  
> Given the importance of The Database, i'd be keen to see attached such  
> community-agreed quality references. And going further, not just to the  
> releases but also to substantial new features (those yet to reach GA). Then  
> the downloads page could provide a table something like  
> https://paste.apache.org/FzrQ  
>  
> It's just one idea to throw out there, and while it hijacks the thread a  
> bit, it could even with just the quality tag on releases go a long way with  
> user trust. Especially if we really are humble about it and use GA  
> appropriately. For example I'm perfectly happy using a beta in production  
> if I see the community otherwise has good processes in place and there's  
> strong testing and staging resources to take advantage of. And as Kurt has  
> implied many users are indeed smart and wise enough to know how to safely  
> test and cautiously use even alpha features in production.  
>  
> Anyway, with or without the above idea, yaml flag names that don't  
> use adjectives could address Kurt's concerns about pulling the rug from  
> under the feet of existing users. Such a flag is but a small improvement  
> suitable for a minor release (you must read the NEWS.txt before even a  
> patch upgrade), and the documentation is only making explicit what should  
> have been all along. Users shouldn't feel that we're returning features  
> into "alpha|beta" mode when what we're actually doing is improving the  
> comm

Re: Proposal to retroactively mark materialized views experimental

2017-10-04 Thread Jonathan Haddad
I agree with Aleksey on all points here. I'd add that we should update the
docs with warnings about the potential correctness issues.
On Wed, Oct 4, 2017 at 8:25 AM Aleksey Yeshchenko  wrote:

> We already have those for UDFs and CDC.
>
> We should have more: for triggers, SASI, and MVs, at least. Operators need
> a way to disable features they haven’t validated.
>
> We already have sufficient consensus to introduce the flags, and we
> should. There also seems to be sufficient consensus on emitting warnings.
>
> The debate is now on their defaults for MVs in 3.0, 3.11, and 4.0. I agree
> with Sylvain that flipping the default in a minor would be invasive. We
> shouldn’t do that.
>
> For trunk, though, I think we should default to off. When it comes to
> releasing 4.0 we can collectively decide if there is sufficient trust in
> MVs at the time to warrant flipping the default to true. Ultimately we can
> decide this in a PMC vote. If I misread the consensus regarding the default
> for 4.0, then we might as well vote on that. What I see is sufficient
> distrust coming from core committers, including the author of the v1
> design, to warrant opt-in for MVs.
>
> If we don’t trust in them as developers, we shouldn’t be cavalier with the
> users, either. Not until that trust is gained/regained.
>
> —
> AY
>
> On 4 October 2017 at 13:26:10, Stefan Podkowinski (s...@apache.org) wrote:
>
> Introducing feature flags for enabling or disabling different code paths
> is not sustainable in the long run. It's hard enough to keep up with
> integration testing with the couple of Jenkins jobs that we have.
> Running jobs for all permutations of flags that we keep around, would
> turn out impractical. But if we don't, I'm pretty sure something will
> fall off the radar and it won't take long until someone reports that
> enabling feature X after the latest upgrade will simply not work anymore.
>
> There may also be some more subtle assumptions and cross dependencies
> between features that may cause side effects by disabling a feature (or
> parts of it), even if it's just e.g. a metric value that suddenly won't
> get updated anymore, but is used somewhere else. We'll also have to
> consider migration paths for turning a feature on and off again without
> causing any downtime. If I was to turn on e.g. MVs on a single node in
> my cluster, then this should not cause any issues on the other nodes
> that still have MV code paths disabled. Again, this would need to be
> tested.
>
> So to be clear, my point is that any flags should be implemented in a
> really non-invasive way on the user facing side only, e.g. by emitting a
> log message or cqlsh error. At this point, I'm not really sure if it
> would be a good idea to add them to cassandra.yaml, as I'm pretty sure
> that eventually they will be used to change the behaviour of our code,
> beside printing a log message.
>
>
> On 04.10.17 10:03, Mick Semb Wever wrote:
> >>> CDC sounds like it is in the same basket, but it already has the
> >>> `cdc_enabled` yaml flag which defaults false.
> >> I went this route because I was incredibly wary of changing the CL
> >> code and wanted to shield non-CDC users from any and all risk I
> >> reasonably could.
> >
> > This approach so far is my favourite. (Thanks Josh.)
> >
> > The flag name `cdc_enabled` is simple and, without adjectives, does not
> > imply "experimental" or "beta" or anything like that.
> > It does make life easier for both operators and the C* developers.
> >
> > I'm also fond of how Apache projects often vote both on the release as
> well
> > as its stability flag: Alpha|Beta|GA (General Availability).
> > https://httpd.apache.org/dev/release.html
> > http://www.apache.org/legal/release-policy.html#release-types
> >
> > Given the importance of The Database, i'd be keen to see attached such
> > community-agreed quality references. And going further, not just to the
> > releases but also to substantial new features (those yet to reach GA).
> Then
> > the downloads page could provide a table something like
> > https://paste.apache.org/FzrQ
> >
> > It's just one idea to throw out there, and while it hijacks the thread a
> > bit, it could even with just the quality tag on releases go a long way
> with
> > user trust. Especially if we really are humble about it and use GA
> > appropriately. For example I'm perfectly happy using a beta in production
> > if I see the community otherwise has good processes in place and there's
> > strong testing and staging resources to take advantage of. And as Kurt
> has
> > implied many users are indeed smart and wise enough to know how to safely
> > test and cautiously use even alpha features in production.
> >
> > Anyway, with or without the above idea, yaml flag names that don't
> > use adjectives could address Kurt's concerns about pulling the rug from
> > under the feet of existing users. Such a flag is but a small improvement
> > suitable for a minor release (you must read the NEWS.txt 

Re: Proposal to retroactively mark materialized views experimental

2017-10-04 Thread Benedict Elliott Smith
Oh, come on. You're being disingenuous.

I invented both algorithms, so I get some say in which is more complex.  I
fully understand the behaviour of early reopen and can explain it to a
layperson in around five minutes.  The last time I posted an analysis of MVs, it took
me several days to get it straight in my head just enough to be sure the novel
problems I was pointing out existed - and in no way did I have confidence I had
established all the problems.  It wasn't until well after it was completed that we
realised it had some hugely fundamental limitations around primary keys.  I
would NOT be able to explain the algorithm or its implications to a layperson
AT ALL.

That said, I would absolutely be comfortable marking incremental repair and
SASI experimental if this is required to cover MVs with the moniker.  The
former is less complex than MVs, but it fits a similar category of complex
distributed-systems implications we hadn't properly modelled. It *has* now had
extensive testing in the wild, though. Conversely, SASI has had very little burn
testing, but it employs fairly well established approaches and suffers from very
little distributed-systems complexity.

> On 4 Oct 2017, at 11:12, Josh McKenzie  wrote:
> 
> I don't agree at face value that early re-open is in sum a lot simpler than
> MV, or that adding CQL and deprecating Thrift was a lot simpler, or the
> 8099 refactor, etc. Different types of complexity, certainly, and MV's are
> arguably harder to prove correct due to surface area of exposure to failure
> states. Definitions of complexity aside, I do agree with the general
> principle that MV's are very complex and, as with many other things in the
> DB, boundary conditions are insufficiently understood and tested at this
> time. There's also a recency bias to the defects and active work people are
> seeing with MV as there has been a recent focus on stabilizing that rather
> than with the long tail we've seen with other, more pervasive and
> foundational changes to the code-base over the course of the past few years.
> 
> MV's aren't the only thing in the DB that I think qualify for 'flagging as
> not-production-ready' by the criteria people are attempting to selectively
> apply to the feature here. If we go the route of flagging one already
> released feature experimental because we lack confidence in it, there are
> other things we similarly lack confidence in that should be treated
> similarly (incremental repair, SASI to name two that immediately come to
> mind). I personally don't think changing the qualification and user
> experience of features post-release sends a good message to said users; if
> we all agreed unanimously that these features were this failure-prone and
> high-risk, it would be more appropriate to make that change however that's
> obviously not the case here.
> 
> 
> On Wed, Oct 4, 2017 at 10:41 AM, Benedict Elliott Smith 
> <_...@belliottsmith.com
>> wrote:
> 
>> So, as the author of one of the disasters you mention (early re-open), I
>> would prefer to learn from the mistake and not repeat it.  Unfortunately we
>> seem to be in the habit of repeating it, and that feature was a lot *lot*
>> simpler.
>> 
>> Let’s not kid ourselves: MVs are by far and away the most complicated
>> feature we have ever delivered.  We do not fully understand it, even in
>> theory, let alone can we be sure we have the implementation right.
>> 
>> So, if we all agree our testing is ordinarily insufficient, can’t we agree
>> it is probably *really* insufficient here?
>> 
>> I don’t want to give the impression I’m shifting the goals.  I’ve been
>> against MV inclusion as they stand for some time, as were several others.
>> I think in the new world order of project/community structure, they
>> probably would have been rejected as they stand.
>> 
>> I’ve consistently listed my own requirements for considering them
>> production ready:  extensive modelling and simulation of the algorithm’s
>> properties (in lieu of formal proofs), *safe* default behaviour (rollback
>> CASSANDRA-10230, or make it a per-table option, and default to fast only
>> for existing tables to avoid surprise), tools for detecting and repairing
>> inconsistencies, and more extensive testing.
>> 
>> Many of these things were agreed as prerequisites for release of 3.0, but
>> ultimately they were not delivered.
>> 
>> I do, however, absolutely agree with Sylvain that we need to minimise
>> surprise in a patch version.
>> 
>> 
>> On 4 Oct 2017, at 08:58, Josh McKenzie  wrote:
>> 
 and providing a feature we don't fully understand, have not fully
>>> documented the caveats of, let alone discovered all the problems with nor
>>> had that knowledge percolate fully into the wider community.
>>> There appear to be varying levels of understanding of the implementation
>>> details of MV's (that seem to directly correlate with faith in the
>>> feature's correctness for the use-cases recommended) on this email thread
>>> so while I r

Re: Proposal to retroactively mark materialized views experimental

2017-10-04 Thread Jeremy Hanna
Not to detract from the discussion about whether or not to classify X or Y as
experimental, but https://issues.apache.org/jira/browse/CASSANDRA-8303 was
originally about operators preventing users from abusing features (e.g. ALLOW
FILTERING).  Could that concept be extended to features like MVs or SASI or
anything else?  On the one hand, it is nice to be able to set those things
dynamically, without a rolling restart, as well as per user.  On the other,
it's less clear what the defaults should be.  There could be a property file,
or just the yaml, in which the operator could specify the default features
that are enabled for users, and then that could be overridden within that
framework.

> On Oct 4, 2017, at 10:24 AM, Aleksey Yeshchenko  wrote:
> 
> We already have those for UDFs and CDC.
> 
> We should have more: for triggers, SASI, and MVs, at least. Operators need a 
> way to disable features they haven’t validated.
> 
> We already have sufficient consensus to introduce the flags, and we should. 
> There also seems to be sufficient consensus on emitting warnings.
> 
> The debate is now on their defaults for MVs in 3.0, 3.11, and 4.0. I agree 
> with Sylvain that flipping the default in a minor would be invasive. We 
> shouldn’t do that.
> 
> For trunk, though, I think we should default to off. When it comes to 
> releasing 4.0 we can collectively decide if there is sufficient trust in MVs 
> at the time to warrant flipping the default to true. Ultimately we can decide 
> this in a PMC vote. If I misread the consensus regarding the default for 4.0, 
> then we might as well vote on that. What I see is sufficient distrust coming 
> from core committers, including the author of the v1 design, to warrant 
> opt-in for MVs.
> 
> If we don’t trust in them as developers, we shouldn’t be cavalier with the 
> users, either. Not until that trust is gained/regained.
> 
> —
> AY
> 
> On 4 October 2017 at 13:26:10, Stefan Podkowinski (s...@apache.org) wrote:
> 
> Introducing feature flags for enabling or disabling different code paths  
> is not sustainable in the long run. It's hard enough to keep up with  
> integration testing with the couple of Jenkins jobs that we have.  
> Running jobs for all permutations of flags that we keep around, would  
> turn out impractical. But if we don't, I'm pretty sure something will  
> fall off the radar and it won't take long until someone reports that  
> enabling feature X after the latest upgrade will simply not work anymore.  
> 
> There may also be some more subtle assumptions and cross dependencies  
> between features that may cause side effects by disabling a feature (or  
> parts of it), even if it's just e.g. a metric value that suddenly won't  
> get updated anymore, but is used somewhere else. We'll also have to  
> consider migration paths for turning a feature on and off again without  
> causing any downtime. If I was to turn on e.g. MVs on a single node in  
> my cluster, then this should not cause any issues on the other nodes  
> that still have MV code paths disabled. Again, this would need to be tested.  
> 
> So to be clear, my point is that any flags should be implemented in a  
> really non-invasive way on the user facing side only, e.g. by emitting a  
> log message or cqlsh error. At this point, I'm not really sure if it  
> would be a good idea to add them to cassandra.yaml, as I'm pretty sure  
> that eventually they will be used to change the behaviour of our code,  
> beside printing a log message.  
> 
> 
> On 04.10.17 10:03, Mick Semb Wever wrote:  
 CDC sounds like it is in the same basket, but it already has the  
 `cdc_enabled` yaml flag which defaults false.  
>>> I went this route because I was incredibly wary of changing the CL  
>>> code and wanted to shield non-CDC users from any and all risk I  
>>> reasonably could.  
>> 
>> This approach so far is my favourite. (Thanks Josh.)  
>> 
>> The flag name `cdc_enabled` is simple and, without adjectives, does not  
>> imply "experimental" or "beta" or anything like that.  
>> It does make life easier for both operators and the C* developers.  
>> 
>> I'm also fond of how Apache projects often vote both on the release as well  
>> as its stability flag: Alpha|Beta|GA (General Availability).  
>> https://httpd.apache.org/dev/release.html  
>> http://www.apache.org/legal/release-policy.html#release-types  
>> 
>> Given the importance of The Database, i'd be keen to see attached such  
>> community-agreed quality references. And going further, not just to the  
>> releases but also to substantial new features (those yet to reach GA). Then  
>> the downloads page could provide a table something like  
>> https://paste.apache.org/FzrQ  
>> 
>> It's just one idea to throw out there, and while it hijacks the thread a  
>> bit, it could even with just the quality tag on releases go a long way with  
>> user trust. Especially if we really are humble about it and

Re: Proposal to retroactively mark materialized views experimental

2017-10-04 Thread Stefan Podkowinski
If "disabling a feature" is just about preventing some CQL from
execution along with a warning log message, I'm fine with that. But if
that's being the case, I don't really understand why making this change
in a minor version would be a problem, since existing MVs wouldn't be
affected anyways and should just work as before, even with the enabled
flag set to false.


On 04.10.17 17:24, Aleksey Yeshchenko wrote:
> We already have those for UDFs and CDC.
>
> We should have more: for triggers, SASI, and MVs, at least. Operators need a 
> way to disable features they haven’t validated.
>
> We already have sufficient consensus to introduce the flags, and we should. 
> There also seems to be sufficient consensus on emitting warnings.
>
> The debate is now on their defaults for MVs in 3.0, 3.11, and 4.0. I agree 
> with Sylvain that flipping the default in a minor would be invasive. We 
> shouldn’t do that.
>
> For trunk, though, I think we should default to off. When it comes to 
> releasing 4.0 we can collectively decide if there is sufficient trust in MVs 
> at the time to warrant flipping the default to true. Ultimately we can decide 
> this in a PMC vote. If I misread the consensus regarding the default for 4.0, 
> then we might as well vote on that. What I see is sufficient distrust coming 
> from core committers, including the author of the v1 design, to warrant 
> opt-in for MVs.
>
> If we don’t trust in them as developers, we shouldn’t be cavalier with the 
> users, either. Not until that trust is gained/regained.
>
> —
> AY
>
> On 4 October 2017 at 13:26:10, Stefan Podkowinski (s...@apache.org) wrote:
>
> Introducing feature flags for enabling or disabling different code paths  
> is not sustainable in the long run. It's hard enough to keep up with  
> integration testing with the couple of Jenkins jobs that we have.  
> Running jobs for all permutations of flags that we keep around, would  
> turn out impractical. But if we don't, I'm pretty sure something will  
> fall off the radar and it won't take long until someone reports that  
> enabling feature X after the latest upgrade will simply not work anymore.  
>
> There may also be some more subtle assumptions and cross dependencies  
> between features that may cause side effects by disabling a feature (or  
> parts of it), even if it's just e.g. a metric value that suddenly won't  
> get updated anymore, but is used somewhere else. We'll also have to  
> consider migration paths for turning a feature on and off again without  
> causing any downtime. If I was to turn on e.g. MVs on a single node in  
> my cluster, then this should not cause any issues on the other nodes  
> that still have MV code paths disabled. Again, this would need to be tested.  
>
> So to be clear, my point is that any flags should be implemented in a  
> really non-invasive way on the user facing side only, e.g. by emitting a  
> log message or cqlsh error. At this point, I'm not really sure if it  
> would be a good idea to add them to cassandra.yaml, as I'm pretty sure  
> that eventually they will be used to change the behaviour of our code,  
> beside printing a log message.  
>
>
> On 04.10.17 10:03, Mick Semb Wever wrote:  
 CDC sounds like it is in the same basket, but it already has the  
 `cdc_enabled` yaml flag which defaults false.  
>>> I went this route because I was incredibly wary of changing the CL  
>>> code and wanted to shield non-CDC users from any and all risk I  
>>> reasonably could.  
>>  
>> This approach so far is my favourite. (Thanks Josh.)  
>>  
>> The flag name `cdc_enabled` is simple and, without adjectives, does not  
>> imply "experimental" or "beta" or anything like that.  
>> It does make life easier for both operators and the C* developers.  
>>  
>> I'm also fond of how Apache projects often vote both on the release as well  
>> as its stability flag: Alpha|Beta|GA (General Availability).  
>> https://httpd.apache.org/dev/release.html  
>> http://www.apache.org/legal/release-policy.html#release-types  
>>  
>> Given the importance of The Database, i'd be keen to see attached such  
>> community-agreed quality references. And going further, not just to the  
>> releases but also to substantial new features (those yet to reach GA). Then  
>> the downloads page could provide a table something like  
>> https://paste.apache.org/FzrQ  
>>  
>> It's just one idea to throw out there, and while it hijacks the thread a  
>> bit, it could even with just the quality tag on releases go a long way with  
>> user trust. Especially if we really are humble about it and use GA  
>> appropriately. For example I'm perfectly happy using a beta in production  
>> if I see the community otherwise has good processes in place and there's  
>> strong testing and staging resources to take advantage of. And as Kurt has  
>> implied many users are indeed smart and wise enough to know how to safely  
>> test and cautiously use even alpha features in production

Re: Proposal to retroactively mark materialized views experimental

2017-10-04 Thread Benedict Elliott Smith
Can't we promote these behavioural flags to keyspace properties (with suitable
permissions required to edit them)?

I agree that enabling/disabling features shouldn't require a rolling restart,
nor should switching their consistency safety level.

I think this would be the most suitable equivalent to ALLOW FILTERING for MVs.
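
A rough sketch of the idea (the property name and syntax here are entirely
hypothetical; nothing like this exists today):

  -- opt a keyspace in to MVs; CREATE MATERIALIZED VIEW statements in keyspaces
  -- that haven't opted in would be rejected with a client-visible error
  ALTER KEYSPACE ks WITH materialized_views_enabled = true;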



> On 4 Oct 2017, at 12:31, Jeremy Hanna  wrote:
> 
> Not to detract from the discussion about whether or not to classify X or Y as 
> experimental but https://issues.apache.org/jira/browse/CASSANDRA-8303 
>  was originally about 
> operators preventing users from abusing features (e.g. allow filtering).  
> Could that concept be extended to features like MVs or SASI or anything else? 
>  On the one hand it is nice to be able to set those things dynamically 
> without a rolling restart as well as by user.  On the other it’s less clear 
> about defaults.  There could be a property file or just in the yaml, the 
> operator could specify the default features that are enabled for users and 
> then it could be overridden within that framework.
> 
>> On Oct 4, 2017, at 10:24 AM, Aleksey Yeshchenko  wrote:
>> 
>> We already have those for UDFs and CDC.
>> 
>> We should have more: for triggers, SASI, and MVs, at least. Operators need a 
>> way to disable features they haven’t validated.
>> 
>> We already have sufficient consensus to introduce the flags, and we should. 
>> There also seems to be sufficient consensus on emitting warnings.
>> 
>> The debate is now on their defaults for MVs in 3.0, 3.11, and 4.0. I agree 
>> with Sylvain that flipping the default in a minor would be invasive. We 
>> shouldn’t do that.
>> 
>> For trunk, though, I think we should default to off. When it comes to 
>> releasing 4.0 we can collectively decide if there is sufficient trust in MVs 
>> at the time to warrant flipping the default to true. Ultimately we can 
>> decide this in a PMC vote. If I misread the consensus regarding the default 
>> for 4.0, then we might as well vote on that. What I see is sufficient 
>> distrust coming from core committers, including the author of the v1 design, 
>> to warrant opt-in for MVs.
>> 
>> If we don’t trust in them as developers, we shouldn’t be cavalier with the 
>> users, either. Not until that trust is gained/regained.
>> 
>> —
>> AY
>> 
>> On 4 October 2017 at 13:26:10, Stefan Podkowinski (s...@apache.org) wrote:
>> 
>> Introducing feature flags for enabling or disabling different code paths  
>> is not sustainable in the long run. It's hard enough to keep up with  
>> integration testing with the couple of Jenkins jobs that we have.  
>> Running jobs for all permutations of flags that we keep around, would  
>> turn out impractical. But if we don't, I'm pretty sure something will  
>> fall off the radar and it won't take long until someone reports that  
>> enabling feature X after the latest upgrade will simply not work anymore.  
>> 
>> There may also be some more subtle assumptions and cross dependencies  
>> between features that may cause side effects by disabling a feature (or  
>> parts of it), even if it's just e.g. a metric value that suddenly won't  
>> get updated anymore, but is used somewhere else. We'll also have to  
>> consider migration paths for turning a feature on and off again without  
>> causing any downtime. If I was to turn on e.g. MVs on a single node in  
>> my cluster, then this should not cause any issues on the other nodes  
>> that still have MV code paths disabled. Again, this would need to be tested. 
>>  
>> 
>> So to be clear, my point is that any flags should be implemented in a  
>> really non-invasive way on the user facing side only, e.g. by emitting a  
>> log message or cqlsh error. At this point, I'm not really sure if it  
>> would be a good idea to add them to cassandra.yaml, as I'm pretty sure  
>> that eventually they will be used to change the behaviour of our code,  
>> beside printing a log message.  
>> 
>> 
>> On 04.10.17 10:03, Mick Semb Wever wrote:  
> CDC sounds like it is in the same basket, but it already has the  
> `cdc_enabled` yaml flag which defaults false.  
 I went this route because I was incredibly wary of changing the CL  
 code and wanted to shield non-CDC users from any and all risk I  
 reasonably could.  
>>> 
>>> This approach so far is my favourite. (Thanks Josh.)  
>>> 
>>> The flag name `cdc_enabled` is simple and, without adjectives, does not  
>>> imply "experimental" or "beta" or anything like that.  
>>> It does make life easier for both operators and the C* developers.  
>>> 
>>> I'm also fond of how Apache projects often vote both on the release as well  
>>> as its stability flag: Alpha|Beta|GA (General Availability).  
>>> https://httpd.apache.org/dev/release.html  
>>> http://www.apache.org/legal/release-policy.html#release-types  
>>> 
>>> Given the importance of The Database, i'd be keen to see attache

Re: Proposal to retroactively mark materialized views experimental

2017-10-04 Thread Aleksey Yeshchenko
Yep. Almost!

Changing the default in a minor version might break some scripts/tooling that 
manipulate schema and potentially create new MVs (as dangerous as it currently 
is - manipulating schema in that way), and that would still not be very nice.

Introducing a flag and leaving it at false in a minor is harmless.

—
AY

On 4 October 2017 at 18:24:16, Stefan Podkowinski (s...@apache.org) wrote:

If "disabling a feature" is just about preventing some CQL from 
execution along with a warning log message, I'm fine with that. But if 
that's being the case, I don't really understand why making this change 
in a minor version would be a problem, since existing MVs wouldn't be 
affected anyways and should just work as before, even with the enabled 
flag set to false.

Re: Proposal to retroactively mark materialized views experimental

2017-10-04 Thread Josh McKenzie
>
> Oh, come on. You're being disingenuous.

Not my intent. MV's (and SASI, for example) are fairly well isolated; we
have a history of other changes that are much more broadly and higher
impact risk-wise across the code-base.

If I were an operator and built a critical part of my business on a
released feature that developers then decided to default-disable as
'experimental' post-hoc, I'd think long and hard about using any new
features in that project in the future (and revisit my confidence in all
other features I relied on, and the software as a whole). We have users in
the wild relying on MV's with apparent success (same holds true of all the
other punching bags that have come up in this thread) and I'd hate to see
us alienate them by being over-aggressive in the way we handle this.

I'd much rather we continue to aggressively improve and continue to analyze
MV's stability before a 4.0 release and then use the experimental flag in
the future, if at all possible.

On Wed, Oct 4, 2017 at 2:01 PM, Benedict Elliott Smith <_...@belliottsmith.com>
wrote:

> Can't we promote these behavioural flags to keyspace properties (with
> suitable permissions to edit necessary)?
>
> I agree that enabling/disabling features shouldn't require a rolling
> restart, and nor should switching their consistency safety level.
>
> I think this would be the most suitable equivalent to ALLOW FILTERING for
> MVs.
>
>
>
> > On 4 Oct 2017, at 12:31, Jeremy Hanna 
> wrote:
> >
> > Not to detract from the discussion about whether or not to classify X or
> Y as experimental but https://issues.apache.org/jira/browse/CASSANDRA-8303
>  was originally
> about operators preventing users from abusing features (e.g. allow
> filtering).  Could that concept be extended to features like MVs or SASI or
> anything else?  On the one hand it is nice to be able to set those things
> dynamically without a rolling restart as well as by user.  On the other
> it’s less clear about defaults.  There could be a property file or just in
> the yaml, the operator could specify the default features that are enabled
> for users and then it could be overridden within that framework.
> >
> >> On Oct 4, 2017, at 10:24 AM, Aleksey Yeshchenko 
> wrote:
> >>
> >> We already have those for UDFs and CDC.
> >>
> >> We should have more: for triggers, SASI, and MVs, at least. Operators
> need a way to disable features they haven’t validated.
> >>
> >> We already have sufficient consensus to introduce the flags, and we
> should. There also seems to be sufficient consensus on emitting warnings.
> >>
> >> The debate is now on their defaults for MVs in 3.0, 3.11, and 4.0. I
> agree with Sylvain that flipping the default in a minor would be invasive.
> We shouldn’t do that.
> >>
> >> For trunk, though, I think we should default to off. When it comes to
> releasing 4.0 we can collectively decide if there is sufficient trust in
> MVs at the time to warrant flipping the default to true. Ultimately we can
> decide this in a PMC vote. If I misread the consensus regarding the default
> for 4.0, then we might as well vote on that. What I see is sufficient
> distrust coming from core committers, including the author of the v1
> design, to warrant opt-in for MVs.
> >>
> >> If we don’t trust in them as developers, we shouldn’t be cavalier with
> the users, either. Not until that trust is gained/regained.
> >>
> >> —
> >> AY
> >>
> >> On 4 October 2017 at 13:26:10, Stefan Podkowinski (s...@apache.org)
> wrote:
> >>
> >> Introducing feature flags for enabling or disabling different code paths
> >> is not sustainable in the long run. It's hard enough to keep up with
> >> integration testing with the couple of Jenkins jobs that we have.
> >> Running jobs for all permutations of flags that we keep around, would
> >> turn out impractical. But if we don't, I'm pretty sure something will
> >> fall off the radar and it won't take long until someone reports that
> >> enabling feature X after the latest upgrade will simply not work
> anymore.
> >>
> >> There may also be some more subtle assumptions and cross dependencies
> >> between features that may cause side effects by disabling a feature (or
> >> parts of it), even if it's just e.g. a metric value that suddenly won't
> >> get updated anymore, but is used somewhere else. We'll also have to
> >> consider migration paths for turning a feature on and off again without
> >> causing any downtime. If I was to turn on e.g. MVs on a single node in
> >> my cluster, then this should not cause any issues on the other nodes
> >> that still have MV code paths disabled. Again, this would need to be
> tested.
> >>
> >> So to be clear, my point is that any flags should be implemented in a
> >> really non-invasive way on the user facing side only, e.g. by emitting a
> >> log message or cqlsh error. At this point, I'm not really sure if it
> >> would be a good idea to add them to cassandra.yaml, as I'm pretty

Re: Proposal to retroactively mark materialized views experimental

2017-10-04 Thread Jon Haddad
So you’d rather continue to lie to users about the stability of the feature 
rather than admitting it was merged in prematurely?  I’d rather come clean and 
avoid future problems, and give people the opportunity to stop using MVs rather 
than let them keep taking risks they’re unaware of.  This is incredibly 
irresponsible in my opinion.  

> On Oct 4, 2017, at 11:26 AM, Josh McKenzie  wrote:
> 
>> 
>> Oh, come on. You're being disingenuous.
> 
> Not my intent. MV's (and SASI, for example) are fairly well isolated; we
> have a history of other changes that are much more broadly and higher
> impact risk-wise across the code-base.
> 
> If I were an operator and built a critical part of my business on a
> released feature that developers then decided to default-disable as
> 'experimental' post-hoc, I'd think long and hard about using any new
> features in that project in the future (and revisit my confidence in all
> other features I relied on, and the software as a whole). We have users in
> the wild relying on MV's with apparent success (same holds true of all the
> other punching bags that have come up in this thread) and I'd hate to see
> us alienate them by being over-aggressive in the way we handle this.
> 
> I'd much rather we continue to aggressively improve and continue to analyze
> MV's stability before a 4.0 release and then use the experimental flag in
> the future, if at all possible.
> 
> On Wed, Oct 4, 2017 at 2:01 PM, Benedict Elliott Smith 
> <_...@belliottsmith.com>
> wrote:
> 
>> Can't we promote these behavioural flags to keyspace properties (with
>> suitable permissions to edit necessary)?
>> 
>> I agree that enabling/disabling features shouldn't require a rolling
>> restart, and nor should switching their consistency safety level.
>> 
>> I think this would be the most suitable equivalent to ALLOW FILTERING for
>> MVs.
>> 
>> 
>> 
>>> On 4 Oct 2017, at 12:31, Jeremy Hanna 
>> wrote:
>>> 
>>> Not to detract from the discussion about whether or not to classify X or
>> Y as experimental but https://issues.apache.org/jira/browse/CASSANDRA-8303
>>  was originally
>> about operators preventing users from abusing features (e.g. allow
>> filtering).  Could that concept be extended to features like MVs or SASI or
>> anything else?  On the one hand it is nice to be able to set those things
>> dynamically without a rolling restart as well as by user.  On the other
>> it’s less clear about defaults.  There could be a property file or just in
>> the yaml, the operator could specify the default features that are enabled
>> for users and then it could be overridden within that framework.
>>> 
 On Oct 4, 2017, at 10:24 AM, Aleksey Yeshchenko 
>> wrote:
 
 We already have those for UDFs and CDC.
 
 We should have more: for triggers, SASI, and MVs, at least. Operators
>> need a way to disable features they haven’t validated.
 
 We already have sufficient consensus to introduce the flags, and we
>> should. There also seems to be sufficient consensus on emitting warnings.
 
 The debate is now on their defaults for MVs in 3.0, 3.11, and 4.0. I
>> agree with Sylvain that flipping the default in a minor would be invasive.
>> We shouldn’t do that.
 
 For trunk, though, I think we should default to off. When it comes to
>> releasing 4.0 we can collectively decide if there is sufficient trust in
>> MVs at the time to warrant flipping the default to true. Ultimately we can
>> decide this in a PMC vote. If I misread the consensus regarding the default
>> for 4.0, then we might as well vote on that. What I see is sufficient
>> distrust coming from core committers, including the author of the v1
>> design, to warrant opt-in for MVs.
 
 If we don’t trust in them as developers, we shouldn’t be cavalier with
>> the users, either. Not until that trust is gained/regained.
 
 —
 AY
 
 On 4 October 2017 at 13:26:10, Stefan Podkowinski (s...@apache.org)
>> wrote:
 
 Introducing feature flags for enabling or disabling different code paths
 is not sustainable in the long run. It's hard enough to keep up with
 integration testing with the couple of Jenkins jobs that we have.
 Running jobs for all permutations of flags that we keep around, would
 turn out impractical. But if we don't, I'm pretty sure something will
 fall off the radar and it won't take long until someone reports that
 enabling feature X after the latest upgrade will simply not work
>> anymore.
 
 There may also be some more subtle assumptions and cross dependencies
 between features that may cause side effects by disabling a feature (or
 parts of it), even if it's just e.g. a metric value that suddenly won't
 get updated anymore, but is used somewhere else. We'll also have to
 consider migration paths for turning a feature on and off again without
 causing any downtime. If I was t

Re: Proposal to retroactively mark materialized views experimental

2017-10-04 Thread Aleksey Yeshchenko
Strongly disagree with MVs being an isolated part.

You can feel the touch of MVs in the read path, write path, and metadata 
handling, whether you use them or not. And comparing any of those before/after 
MVs were introduced makes me sad every time I face any of it. It made our 
codebase objectively worse.

On 4 October 2017 at 19:26:43, Josh McKenzie (jmcken...@apache.org) wrote:

MV's (and SASI, for example) are fairly well isolated


Well, if the developers keep pushing untested complex features onto the 
project, then refuse to admit their mistakes,

then as an operator you *should* think long and hard and you *should* revisit 
your confidence. Or else you are a shitty operator.



On 4 October 2017 at 19:26:43, Josh McKenzie (jmcken...@apache.org) wrote:

If I were an operator and built a critical part of my business on a 
released feature that developers then decided to default-disable as 
'experimental' post-hoc, I'd think long and hard about using any new 
features in that project in the future (and revisit my confidence in all 
other features I relied on, and the software as a whole).


—

AY


Re: Proposal to retroactively mark materialized views experimental

2017-10-04 Thread Josh McKenzie
>
> So you’d rather continue to lie to users about the stability of the
> feature rather than admitting it was merged in prematurely?


Much like w/SASI, this is something that's in the code-base that for
> certain use-cases apparently works just fine.

I don't know of any outstanding issues with the feature,

There appear to be varying levels of understanding of the implementation
> details of MV's (that seem to directly correlate with faith in the
> feature's correctness for the use-cases recommended)

We have users in the wild relying on MV's with apparent success (same holds
> true of all the other punching bags that have come up in this thread)

You're right, Jon. That's clearly exactly what I'm saying.


On Wed, Oct 4, 2017 at 2:39 PM, Jon Haddad  wrote:

> So you’d rather continue to lie to users about the stability of the
> feature rather than admitting it was merged in prematurely?  I’d rather
> come clean and avoid future problems, and give people the opportunity to
> stop using MVs rather than let them keep taking risks they’re unaware of.
> This is incredibly irresponsible in my opinion.
>
> > On Oct 4, 2017, at 11:26 AM, Josh McKenzie  wrote:
> >
> >>
> >> Oh, come on. You're being disingenuous.
> >
> > Not my intent. MV's (and SASI, for example) are fairly well isolated; we
> > have a history of other changes that are much more broadly and higher
> > impact risk-wise across the code-base.
> >
> > If I were an operator and built a critical part of my business on a
> > released feature that developers then decided to default-disable as
> > 'experimental' post-hoc, I'd think long and hard about using any new
> > features in that project in the future (and revisit my confidence in all
> > other features I relied on, and the software as a whole). We have users
> in
> > the wild relying on MV's with apparent success (same holds true of all
> the
> > other punching bags that have come up in this thread) and I'd hate to see
> > us alienate them by being over-aggressive in the way we handle this.
> >
> > I'd much rather we continue to aggressively improve and continue to
> analyze
> > MV's stability before a 4.0 release and then use the experimental flag in
> > the future, if at all possible.
> >
> > On Wed, Oct 4, 2017 at 2:01 PM, Benedict Elliott Smith <_@
> belliottsmith.com>
> > wrote:
> >
> >> Can't we promote these behavioural flags to keyspace properties (with
> >> suitable permissions to edit necessary)?
> >>
> >> I agree that enabling/disabling features shouldn't require a rolling
> >> restart, and nor should switching their consistency safety level.
> >>
> >> I think this would be the most suitable equivalent to ALLOW FILTERING
> for
> >> MVs.
> >>
> >>
> >>
> >>> On 4 Oct 2017, at 12:31, Jeremy Hanna 
> >> wrote:
> >>>
> >>> Not to detract from the discussion about whether or not to classify X
> or
> >> Y as experimental but https://issues.apache.org/
> jira/browse/CASSANDRA-8303
> >>  was originally
> >> about operators preventing users from abusing features (e.g. allow
> >> filtering).  Could that concept be extended to features like MVs or
> SASI or
> >> anything else?  On the one hand it is nice to be able to set those
> things
> >> dynamically without a rolling restart as well as by user.  On the other
> >> it’s less clear about defaults.  There could be a property file or just
> in
> >> the yaml, the operator could specify the default features that are
> enabled
> >> for users and then it could be overridden within that framework.
> >>>
>  On Oct 4, 2017, at 10:24 AM, Aleksey Yeshchenko 
> >> wrote:
> 
>  We already have those for UDFs and CDC.
> 
>  We should have more: for triggers, SASI, and MVs, at least. Operators
> >> need a way to disable features they haven’t validated.
> 
>  We already have sufficient consensus to introduce the flags, and we
> >> should. There also seems to be sufficient consensus on emitting
> warnings.
> 
>  The debate is now on their defaults for MVs in 3.0, 3.11, and 4.0. I
> >> agree with Sylvain that flipping the default in a minor would be
> invasive.
> >> We shouldn’t do that.
> 
>  For trunk, though, I think we should default to off. When it comes to
> >> releasing 4.0 we can collectively decide if there is sufficient trust in
> >> MVs at the time to warrant flipping the default to true. Ultimately we
> can
> >> decide this in a PMC vote. If I misread the consensus regarding the
> default
> >> for 4.0, then we might as well vote on that. What I see is sufficient
> >> distrust coming from core committers, including the author of the v1
> >> design, to warrant opt-in for MVs.
> 
>  If we don’t trust in them as developers, we shouldn’t be cavalier with
> >> the users, either. Not until that trust is gained/regained.
> 
>  —
>  AY
> 
>  On 4 October 2017 at 13:26:10, Stefan Podkowinski (s...@apache.org)
> >> wrote:
> 
>  I

Re: Proposal to retroactively mark materialized views experimental

2017-10-04 Thread Jon Haddad
MVs work fine for *some use cases*, not the general use case.  That’s why there 
should be a flag.  To opt into the feature when the behavior is only known to 
be correct under a certain set of circumstances.  Nobody is saying the flag 
should be “enable_terrible_feature_nobody_tested_and_we_all_hate”, or something 
ridiculous like that.  It’s not an attack against the work done by anyone, the 
level of effort put in, or minimizing the complexity of the problem.  
“enable_materialized_views” would be just fine.

We should be honest to people about what they’re getting into.  You may not be 
aware of this, but a lot of people still believe Cassandra isn’t a DB that you 
should put in prod.  It’s because features like SASI, MVs,  or incremental 
repair get merged in prematurely (or even made the default), without having 
been thoroughly tested, understood and vetted by trusted community members.  
New users hit the snags because they deploy the bleeding edge code and hit the 
bugs. 

That’s not how the process should work.  

Ideally, we’d follow a process that looks a lot more like this:

1. New feature is built with an opt-in flag.  Unknowns are documented, and the risk 
of using the feature is known to the end user.  
2. People who know what they’re doing test and use the feature.  They are able 
to read the code, submit patches, and help flush out the issues.  They do so in 
low risk environments.  In the case of MVs, they can afford to drop and rebuild 
the view over a week, or rebuild the cluster altogether.  We may not even need 
to worry as much about backwards compatibility.
3. The feature matures.  More tests are written.  More people become aware of 
how to contribute to the feature’s stability.
4. After a while, we vote on removing the feature flag and declare it stable 
for general usage.

If nobody actually cares about a feature (why was it written in the first 
place?), then it would never get to 2, 3, 4.  It would take a while for big 
features like MVs to be marked stable, and that’s fine, because it takes a long 
time to actually stabilize them.  I think we can all agree they are really, 
really hard problems to solve, and maybe it takes a while.

Jon



> On Oct 4, 2017, at 11:44 AM, Josh McKenzie  wrote:
> 
>> 
>> So you’d rather continue to lie to users about the stability of the
>> feature rather than admitting it was merged in prematurely?
> 
> 
> Much like w/SASI, this is something that's in the code-base that for
>> certain use-cases apparently works just fine.
> 
> I don't know of any outstanding issues with the feature,
> 
> There appear to be varying levels of understanding of the implementation
>> details of MV's (that seem to directly correlate with faith in the
>> feature's correctness for the use-cases recommended)
> 
> We have users in the wild relying on MV's with apparent success (same holds
>> true of all the other punching bags that have come up in this thread)
> 
> You're right, Jon. That's clearly exactly what I'm saying.
> 
> 
> On Wed, Oct 4, 2017 at 2:39 PM, Jon Haddad  wrote:
> 
>> So you’d rather continue to lie to users about the stability of the
>> feature rather than admitting it was merged in prematurely?  I’d rather
>> come clean and avoid future problems, and give people the opportunity to
>> stop using MVs rather than let them keep taking risks they’re unaware of.
>> This is incredibly irresponsible in my opinion.
>> 
>>> On Oct 4, 2017, at 11:26 AM, Josh McKenzie  wrote:
>>> 
 
 Oh, come on. You're being disingenuous.
>>> 
>>> Not my intent. MV's (and SASI, for example) are fairly well isolated; we
>>> have a history of other changes that are much more broadly and higher
>>> impact risk-wise across the code-base.
>>> 
>>> If I were an operator and built a critical part of my business on a
>>> released feature that developers then decided to default-disable as
>>> 'experimental' post-hoc, I'd think long and hard about using any new
>>> features in that project in the future (and revisit my confidence in all
>>> other features I relied on, and the software as a whole). We have users
>> in
>>> the wild relying on MV's with apparent success (same holds true of all
>> the
>>> other punching bags that have come up in this thread) and I'd hate to see
>>> us alienate them by being over-aggressive in the way we handle this.
>>> 
>>> I'd much rather we continue to aggressively improve and continue to
>> analyze
>>> MV's stability before a 4.0 release and then use the experimental flag in
>>> the future, if at all possible.
>>> 
>>> On Wed, Oct 4, 2017 at 2:01 PM, Benedict Elliott Smith <_@
>> belliottsmith.com>
>>> wrote:
>>> 
 Can't we promote these behavioural flags to keyspace properties (with
 suitable permissions to edit necessary)?
 
 I agree that enabling/disabling features shouldn't require a rolling
 restart, and nor should switching their consistency safety level.
 
 I think this would be the most suitable equivalent to A

Re: Proposal to retroactively mark materialized views experimental

2017-10-04 Thread Pavel Yaskevich
On Wed, Oct 4, 2017 at 12:09 PM, Jon Haddad  wrote:

> MVs work fine for *some use cases*, not the general use case.  That’s why
> there should be a flag.  To opt into the feature when the behavior is only
> known to be correct under a certain set of circumstances.  Nobody is saying
> the flag should be “enable_terrible_feature_nobody_tested_and_we_all_hate”,
> or something ridiculous like that.  It’s not an attack against the work
> done by anyone, the level of effort put in, or minimizing the complexity of
> the problem.  “enable_materialized_views” would be just fine.
>
> We should be honest to people about what they’re getting into.  You may
> not be aware of this, but a lot of people still believe Cassandra isn’t a
> DB that you should put in prod.  It’s because features like SASI, MVs,  or
> incremental repair get merged in prematurely (or even made the default),
> without having been thoroughly tested, understood and vetted by trusted
> community members.  New users hit the snags because they deploy the
> bleeding edge code and hit the bugs.
>

I beg to differ in the case of SASI: it has been tested and vetted and ported
to different versions. I'm pretty sure it still has better test coverage
than most of the project does. It's not a "default", and you actually have
to opt in to it by creating a custom index, so how is that premature or
misleading to users?
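
To be concrete, that opt-in looks like this (the keyspace, table and column
names are made up for illustration):

CREATE CUSTOM INDEX users_name_sasi ON ks.users (name)
USING 'org.apache.cassandra.index.sasi.SASIIndex';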


>
> That’s not how the process should work.
>
> Ideally, we’d follow a process that looks a lot more like this:
>
> 1. New feature is built with an opt in flag.  Unknowns are documented, the
> risk of using the feature is known to the end user.
> 2. People test and use the feature that know what they’re doing.  They are
> able to read the code, submit patches, and help flush out the issues.  They
> do so in low risk environments.  In the case of MVs, they can afford to
> drop and rebuild the view over a week, or rebuild the cluster altogether.
> We may not even need to worry as much about backwards compatibility.
> 3. The feature matures.  More tests are written.  More people become aware
> of how to contribute to the feature’s stability.
> 4. After a while, we vote on removing the feature flag and declare it
> stable for general usage.
>
> If nobody actually cares about a feature (why it was it written in the
> first place?), then it would never get to 2, 3, 4.  It would take a while
> for big features like MVs to be marked stable, and that’s fine, because it
> takes a long time to actually stabilize them.  I think we can all agree
> they are really, really hard problems to solve, and maybe it takes a while.
>
> Jon
>
>
>
> > On Oct 4, 2017, at 11:44 AM, Josh McKenzie  wrote:
> >
> >>
> >> So you’d rather continue to lie to users about the stability of the
> >> feature rather than admitting it was merged in prematurely?
> >
> >
> > Much like w/SASI, this is something that's in the code-base that for
> >> certain use-cases apparently works just fine.
> >
> > I don't know of any outstanding issues with the feature,
> >
> > There appear to be varying levels of understanding of the implementation
> >> details of MV's (that seem to directly correlate with faith in the
> >> feature's correctness for the use-cases recommended)
> >
> > We have users in the wild relying on MV's with apparent success (same
> holds
> >> true of all the other punching bags that have come up in this thread)
> >
> > You're right, Jon. That's clearly exactly what I'm saying.
> >
> >
> > On Wed, Oct 4, 2017 at 2:39 PM, Jon Haddad  wrote:
> >
> >> So you’d rather continue to lie to users about the stability of the
> >> feature rather than admitting it was merged in prematurely?  I’d rather
> >> come clean and avoid future problems, and give people the opportunity to
> >> stop using MVs rather than let them keep taking risks they’re unaware
> of.
> >> This is incredibly irresponsible in my opinion.
> >>
> >>> On Oct 4, 2017, at 11:26 AM, Josh McKenzie 
> wrote:
> >>>
> 
>  Oh, come on. You're being disingenuous.
> >>>
> >>> Not my intent. MV's (and SASI, for example) are fairly well isolated;
> we
> >>> have a history of other changes that are much more broadly and higher
> >>> impact risk-wise across the code-base.
> >>>
> >>> If I were an operator and built a critical part of my business on a
> >>> released feature that developers then decided to default-disable as
> >>> 'experimental' post-hoc, I'd think long and hard about using any new
> >>> features in that project in the future (and revisit my confidence in
> all
> >>> other features I relied on, and the software as a whole). We have users
> >> in
> >>> the wild relying on MV's with apparent success (same holds true of all
> >> the
> >>> other punching bags that have come up in this thread) and I'd hate to
> see
> >>> us alienate them by being over-aggressive in the way we handle this.
> >>>
> >>> I'd much rather we continue to aggressively improve and continue to
> >> analyze
> >>> MV's stability before a 4.0 release an

Re: Proposal to retroactively mark materialized views experimental

2017-10-04 Thread Jon Haddad
The default part I was referring to was incremental repair.

SASI still has a pretty fatal issue where nodes OOM: 
https://issues.apache.org/jira/browse/CASSANDRA-12662 



> On Oct 4, 2017, at 12:21 PM, Pavel Yaskevich  wrote:
> 
> On Wed, Oct 4, 2017 at 12:09 PM, Jon Haddad  > wrote:
> 
>> MVs work fine for *some use cases*, not the general use case.  That’s why
>> there should be a flag.  To opt into the feature when the behavior is only
>> known to be correct under a certain set of circumstances.  Nobody is saying
>> the flag should be “enable_terrible_feature_nobody_tested_and_we_all_hate”,
>> or something ridiculous like that.  It’s not an attack against the work
>> done by anyone, the level of effort put in, or minimizing the complexity of
>> the problem.  “enable_materialized_views” would be just fine.
>> 
>> We should be honest to people about what they’re getting into.  You may
>> not be aware of this, but a lot of people still believe Cassandra isn’t a
>> DB that you should put in prod.  It’s because features like SASI, MVs,  or
>> incremental repair get merged in prematurely (or even made the default),
>> without having been thoroughly tested, understood and vetted by trusted
>> community members.  New users hit the snags because they deploy the
>> bleeding edge code and hit the bugs.
>> 
> 
> I beg to differ in case of SASI, it has been tested and vetted and ported
> to different versions. I'm pretty sure it still has better test coverage
> then most of the project does, it's not a "default" and you actually have
> to opt-in to it by creating a custom index, how is that premature or
> misleading to users?
> 
> 
>> 
>> That’s not how the process should work.
>> 
>> Ideally, we’d follow a process that looks a lot more like this:
>> 
>> 1. New feature is built with an opt in flag.  Unknowns are documented, the
>> risk of using the feature is known to the end user.
>> 2. People test and use the feature that know what they’re doing.  They are
>> able to read the code, submit patches, and help flush out the issues.  They
>> do so in low risk environments.  In the case of MVs, they can afford to
>> drop and rebuild the view over a week, or rebuild the cluster altogether.
>> We may not even need to worry as much about backwards compatibility.
>> 3. The feature matures.  More tests are written.  More people become aware
>> of how to contribute to the feature’s stability.
>> 4. After a while, we vote on removing the feature flag and declare it
>> stable for general usage.
>> 
>> If nobody actually cares about a feature (why it was it written in the
>> first place?), then it would never get to 2, 3, 4.  It would take a while
>> for big features like MVs to be marked stable, and that’s fine, because it
>> takes a long time to actually stabilize them.  I think we can all agree
>> they are really, really hard problems to solve, and maybe it takes a while.
>> 
>> Jon
>> 
>> 
>> 
>>> On Oct 4, 2017, at 11:44 AM, Josh McKenzie  wrote:
>>> 
 
 So you’d rather continue to lie to users about the stability of the
 feature rather than admitting it was merged in prematurely?
>>> 
>>> 
>>> Much like w/SASI, this is something that's in the code-base that for
 certain use-cases apparently works just fine.
>>> 
>>> I don't know of any outstanding issues with the feature,
>>> 
>>> There appear to be varying levels of understanding of the implementation
 details of MV's (that seem to directly correlate with faith in the
 feature's correctness for the use-cases recommended)
>>> 
>>> We have users in the wild relying on MV's with apparent success (same
>> holds
 true of all the other punching bags that have come up in this thread)
>>> 
>>> You're right, Jon. That's clearly exactly what I'm saying.
>>> 
>>> 
>>> On Wed, Oct 4, 2017 at 2:39 PM, Jon Haddad  wrote:
>>> 
 So you’d rather continue to lie to users about the stability of the
 feature rather than admitting it was merged in prematurely?  I’d rather
 come clean and avoid future problems, and give people the opportunity to
 stop using MVs rather than let them keep taking risks they’re unaware
>> of.
 This is incredibly irresponsible in my opinion.
 
> On Oct 4, 2017, at 11:26 AM, Josh McKenzie 
>> wrote:
> 
>> 
>> Oh, come on. You're being disingenuous.
> 
> Not my intent. MV's (and SASI, for example) are fairly well isolated;
>> we
> have a history of other changes that are much more broadly and higher
> impact risk-wise across the code-base.
> 
> If I were an operator and built a critical part of my business on a
> released feature that developers then decided to default-disable as
> 'experimental' post-hoc, I'd think long and hard about using any new
> features in that project in the future (and revisit my confidence in
>> all
> other features I relied on, and the software as a who

Re: Proposal to retroactively mark materialized views experimental

2017-10-04 Thread Pavel Yaskevich
On Wed, Oct 4, 2017 at 12:23 PM, Jon Haddad  wrote:

> The default part I was referring to incremental repair.
>
> SASI still has a pretty fatal issue where nodes OOM:
> https://issues.apache.org/jira/browse/CASSANDRA-12662 <
> https://issues.apache.org/jira/browse/CASSANDRA-12662>
>

If you read the comments in the issue, the originator of the problem states
that "Cassandra fairly quickly crashes with OOM, a glance over hprof shows
4Gb of PartitionUpdates.", which to me doesn't seem like a SASI issue
but more of an issue with the underlying storage which SASI uses.



>
>
> > On Oct 4, 2017, at 12:21 PM, Pavel Yaskevich  wrote:
> >
> > On Wed, Oct 4, 2017 at 12:09 PM, Jon Haddad  j...@jonhaddad.com>> wrote:
> >
> >> MVs work fine for *some use cases*, not the general use case.  That’s
> why
> >> there should be a flag.  To opt into the feature when the behavior is
> only
> >> known to be correct under a certain set of circumstances.  Nobody is
> saying
> >> the flag should be “enable_terrible_feature_
> nobody_tested_and_we_all_hate”,
> >> or something ridiculous like that.  It’s not an attack against the work
> >> done by anyone, the level of effort put in, or minimizing the
> complexity of
> >> the problem.  “enable_materialized_views” would be just fine.
> >>
> >> We should be honest to people about what they’re getting into.  You may
> >> not be aware of this, but a lot of people still believe Cassandra isn’t
> a
> >> DB that you should put in prod.  It’s because features like SASI, MVs,
> or
> >> incremental repair get merged in prematurely (or even made the default),
> >> without having been thoroughly tested, understood and vetted by trusted
> >> community members.  New users hit the snags because they deploy the
> >> bleeding edge code and hit the bugs.
> >>
> >
> > I beg to differ in case of SASI, it has been tested and vetted and ported
> > to different versions. I'm pretty sure it still has better test coverage
> > then most of the project does, it's not a "default" and you actually have
> > to opt-in to it by creating a custom index, how is that premature or
> > misleading to users?
> >
> >
> >>
> >> That’s not how the process should work.
> >>
> >> Ideally, we’d follow a process that looks a lot more like this:
> >>
> >> 1. New feature is built with an opt in flag.  Unknowns are documented,
> the
> >> risk of using the feature is known to the end user.
> >> 2. People test and use the feature that know what they’re doing.  They
> are
> >> able to read the code, submit patches, and help flush out the issues.
> They
> >> do so in low risk environments.  In the case of MVs, they can afford to
> >> drop and rebuild the view over a week, or rebuild the cluster
> altogether.
> >> We may not even need to worry as much about backwards compatibility.
> >> 3. The feature matures.  More tests are written.  More people become
> aware
> >> of how to contribute to the feature’s stability.
> >> 4. After a while, we vote on removing the feature flag and declare it
> >> stable for general usage.
> >>
> >> If nobody actually cares about a feature (why it was it written in the
> >> first place?), then it would never get to 2, 3, 4.  It would take a
> while
> >> for big features like MVs to be marked stable, and that’s fine, because
> it
> >> takes a long time to actually stabilize them.  I think we can all agree
> >> they are really, really hard problems to solve, and maybe it takes a
> while.
> >>
> >> Jon
> >>
> >>
> >>
> >>> On Oct 4, 2017, at 11:44 AM, Josh McKenzie 
> wrote:
> >>>
> 
>  So you’d rather continue to lie to users about the stability of the
>  feature rather than admitting it was merged in prematurely?
> >>>
> >>>
> >>> Much like w/SASI, this is something that's in the code-base that for
>  certain use-cases apparently works just fine.
> >>>
> >>> I don't know of any outstanding issues with the feature,
> >>>
> >>> There appear to be varying levels of understanding of the
> implementation
>  details of MV's (that seem to directly correlate with faith in the
>  feature's correctness for the use-cases recommended)
> >>>
> >>> We have users in the wild relying on MV's with apparent success (same
> >> holds
>  true of all the other punching bags that have come up in this thread)
> >>>
> >>> You're right, Jon. That's clearly exactly what I'm saying.
> >>>
> >>>
> >>> On Wed, Oct 4, 2017 at 2:39 PM, Jon Haddad  wrote:
> >>>
>  So you’d rather continue to lie to users about the stability of the
>  feature rather than admitting it was merged in prematurely?  I’d
> rather
>  come clean and avoid future problems, and give people the opportunity
> to
>  stop using MVs rather than let them keep taking risks they’re unaware
> >> of.
>  This is incredibly irresponsible in my opinion.
> 
> > On Oct 4, 2017, at 11:26 AM, Josh McKenzie 
> >> wrote:
> >
> >>
> >> Oh, come on. You're being disingenuous.
> >
> > Not my intent. MV's (and SASI, f

Re: CREATE INDEX without IF NOT EXISTS when snapshoting

2017-10-04 Thread Javier Canillas
Kurt,

Thanks for your response. Created this ticket. Feel free to add
anything to it that seems legit.

Downloading Cassandra code right now.

Fix seems quite simple. Expect a pull-request soon xD
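
For anyone skimming the thread, the change being discussed is simply to emit the
idempotent form in the schema file a snapshot produces, so that re-applying it
doesn't fail when the index already exists. Roughly (illustrative CQL; the table
and index names are made up):

-- what the snapshot schema emits today
CREATE INDEX users_email_idx ON ks.users (email);

-- what it would emit after the change
CREATE INDEX IF NOT EXISTS users_email_idx ON ks.users (email);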

2017-10-03 20:19 GMT-03:00 kurt greaves :

> Certainly would make sense and should be trivial.  here is
> where you want to look. Just create a ticket for it and prod here for a
> reviewer once you've got a change.
>


Cassandra pluggable storage engine (update)

2017-10-04 Thread Dikang Gu
Hello C* developers:

In my previous email (
https://www.mail-archive.com/dev@cassandra.apache.org/msg11024.html), I
presented that Instagram was kicking off a project to make C*'s storage
engine pluggable, as in other modern databases like MySQL, MongoDB, etc.,
so that users will be able to choose the most suitable storage engine for
different workloads, or to use different features. In addition to that, a
pluggable storage engine architecture will improve the modularity of the
system and help to increase the testability and reliability of Cassandra.

After months of development and testing, we'd like to share the work we
have done, including the first (draft) version of the C* storage engine API
and the first version of the RocksDB based storage engine.



For the C* storage engine API, here is the draft version we proposed:
https://docs.google.com/document/d/1PxYm9oXW2jJtSDiZ-SR9O20jud_0jnA-mW7ttp2dVmk/edit.
It contains the APIs for read/write requests, streaming, and table
management. The storage engine related functionalities, like data
encoding/decoding format, on-disk data read/write, compaction, etc., will be
taken care of by the storage engine implementation.

Each storage engine is a class, with each instance of the class stored in
the Keyspace instance. So all the column families within a keyspace will
share one storage engine instance.

Once a storage engine instance is created, the Cassandra server issues commands
to the engine instance to perform data storage and retrieval tasks such
as opening a column family, managing column families, and streaming.

How to configure the storage engine for different keyspaces is still open for
discussion. One proposal is that we can add a storage engine option to
the CREATE KEYSPACE CQL command, and potentially we can override the
option per C* node in its config file.
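
For example, that proposal might look roughly like this (hypothetical syntax;
the option name is not final):

CREATE KEYSPACE feed
  WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': 3}
  AND storage_engine = 'RocksEngine';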

Under that API, we implemented a new storage engine, based on RocksDB,
called RocksEngine. In the long term, we want to support most of C*'s existing
features in RocksEngine, and we want to build it in a progressive manner.
For the first version of the RocksDBEngine, we support the following features:

   - Most of non-nested data types
   - Table schema
   - Point query
   - Range query
   - Mutations
   - Timestamp
   - TTL
   - Deletions/Cell tombstones
   - Streaming

We do not support the following features in the first version yet:

   - Multi-partition query
   - Nested data types
   - Counters
   - Range tombstone
   - Materialized views
   - Secondary indexes
   - SASI
   - Repair

At this moment, we've implemented the V1 features and deployed them to our
shadow cluster. Using shadow traffic from our production use cases, we saw a
~3X P99 read latency drop compared to our C* 2.2 prod clusters. Here are
some detailed metrics:
https://docs.google.com/document/d/1DojHPteDPSphO0_N2meZ3zkmqlidRwwe_cJpsXLcp10.


So if you need the features in the existing storage engine, please keep using
the existing storage engine. If you want more predictable and
lower read latency, and the features supported by RocksEngine are enough
for your use cases, then RocksEngine could be a fit for you.

The work is 1% finished, and we want to work together with the community to
make it happen. We presented the work at NGCC last week, and also pushed
the beta version of the pluggable storage engine to the Instagram GitHub
Cassandra repo, rocks_3.0 branch (
https://github.com/Instagram/cassandra/tree/rocks_3.0), which is based on
C* 3.0.12, so please feel free to play with it! You can download it and follow
the instructions (
https://github.com/Instagram/cassandra/blob/rocks_3.0/StorageEngine.md) to
try it out in your test environment; your feedback will be very valuable to
us.
Thanks
Dikang.


Re: Cassandra pluggable storage engine (update)

2017-10-04 Thread Blake Eggleston
Hi Dikang,

Cool stuff. 2 questions. Based on your presentation at ngcc, it seems like 
rocks db stores things in byte order. Does this mean that you have code that 
makes each of the existing types byte comparable, or is clustering order 
implementation dependent? Also, I don't see anything in the draft api that 
seems to support splitting the data set into arbitrary categories (ie repaired 
and unrepaired data living in the same token range). Is support for incremental 
repair planned for v1?

Thanks,

Blake


On October 4, 2017 at 1:28:01 PM, Dikang Gu (dikan...@gmail.com) wrote:

Hello C* developers: 

In my previous email 
(https://www.mail-archive.com/dev@cassandra.apache.org/msg11024.html), I 
presented that Instagram was kicking off a project to make C*'s storage engine 
to be pluggable, as other modern databases, like mysql, mongoDB etc, so that 
users will be able to choose most suitable storage engine for different work 
load, or to use different features. In addition to that, a pluggable storage 
engine architecture will improve the modularity of the system, help to increase 
the testability and reliability of Cassandra.

After months of development and testing, we'd like to share the work we have 
done, including the first(draft) version of the C* storage engine API, and the 
first version of the RocksDB based storage engine.



For the C* storage engine API, here is the draft version we proposed, 
https://docs.google.com/document/d/1PxYm9oXW2jJtSDiZ-SR9O20jud_0jnA-mW7ttp2dVmk/edit.
 It contains the APIs for read/write requests, streaming, and table management. 
The storage engine related functionalities, like data encoding/decoding format, 
on-disk data read/write, compaction, etc, will be taken care by the storage 
engine implementation.

Each storage engine is a class with each instance of the class is stored in the 
Keyspace instance. So all the column families within a keyspace will share one 
storage engine instance.

Once a storage engine instance is created, Cassandra sever issues commands to 
the engine instance to performance data storage and retrieval tasks such as 
opening a column family, managing column families and streaming.

How to config storage engine for different keyspaces? It's still open for 
discussion. One proposal is that we can add the storage engine option in the 
create keyspace cql command, and potentially we can overwrite the option per C* 
node in its config file.

Under that API, we implemented a new storage engine, based on RocksDB, called 
RocksEngine. In long term, we want to support most of C* existing features in 
RocksEngine, and we want to build it in a progressive manner. For the first 
version of the RocksDBEngine, we support following features:
Most of non-nested data types
Table schema
Point query
Range query
Mutations
Timestamp
TTL
Deletions/Cell tombstones
Streaming
We do not supported following features in first version yet:
Multi-partition query
Nested data types
Counters
Range tombstone
Materialized views
Secondary indexes
SASI
Repair
At this moment, we've implemented the V1 features, and deployed it to our 
shadow cluster. Using shadowing traffic of our production use cases, we saw ~3X 
P99 read latency drop, compared to our C* 2.2 prod clusters. Here are some 
detailed metrics: 
https://docs.google.com/document/d/1DojHPteDPSphO0_N2meZ3zkmqlidRwwe_cJpsXLcp10.

So if you need the features in existing storage engine, please keep using the 
existing storage engine. If you want to have a more predictable and lower read 
latency, also the features supported by RocksEngine are enough for your use 
cases, then RocksEngine could be a fit for you.

The work is 1% finished, and we want to work together with community to make it 
happen. We presented the work in NGCC last week, and also pushed the beta 
version of the pluggable storage engine to Instagram github Cassandra repo, 
rocks_3.0 branch (https://github.com/Instagram/cassandra/tree/rocks_3.0), which 
is based on C* 3.0.12, please feel free to play with it! You can download it 
and follow the instructions 
(https://github.com/Instagram/cassandra/blob/rocks_3.0/StorageEngine.md) to try 
it out in your test environment, your feedback will be very valuable to us.

Thanks
Dikang.



Re: Cassandra pluggable storage engine (update)

2017-10-04 Thread Dikang Gu
Hi Blake,

Great questions!

1. Yeah, we implement the encoding algorithms, which encode C* data
types into byte arrays and keep the same sorting order. Our implementation
is based on the orderly lib used in HBase,
https://github.com/ndimiduk/orderly .
2. Repair is not supported yet; we are still working on figuring out the work
needed to support repair or incremental repair.

Thanks
Dikang.

On Wed, Oct 4, 2017 at 1:39 PM, Blake Eggleston 
wrote:

> Hi Dikang,
>
> Cool stuff. 2 questions. Based on your presentation at ngcc, it seems like
> rocks db stores things in byte order. Does this mean that you have code
> that makes each of the existing types byte comparable, or is clustering
> order implementation dependent? Also, I don't see anything in the draft api
> that seems to support splitting the data set into arbitrary categories (ie
> repaired and unrepaired data living in the same token range). Is support
> for incremental repair planned for v1?
>
> Thanks,
>
> Blake
>
>
> On October 4, 2017 at 1:28:01 PM, Dikang Gu (dikan...@gmail.com) wrote:
>
> Hello C* developers:
>
> In my previous email (https://www.mail-archive.com/
> dev@cassandra.apache.org/msg11024.html), I presented that Instagram was
> kicking off a project to make C*'s storage engine to be pluggable, as other
> modern databases, like mysql, mongoDB etc, so that users will be able to
> choose most suitable storage engine for different work load, or to use
> different features. In addition to that, a pluggable storage engine
> architecture will improve the modularity of the system, help to increase
> the testability and reliability of Cassandra.
>
> After months of development and testing, we'd like to share the work we
> have done, including the first(draft) version of the C* storage engine API,
> and the first version of the RocksDB based storage engine.
>
>
>
> For the C* storage engine API, here is the draft version we proposed,
> https://docs.google.com/document/d/1PxYm9oXW2jJtSDiZ-
> SR9O20jud_0jnA-mW7ttp2dVmk/edit. It contains the APIs for read/write
> requests, streaming, and table management. The storage engine related
> functionalities, like data encoding/decoding format, on-disk data
> read/write, compaction, etc, will be taken care by the storage engine
> implementation.
>
> Each storage engine is a class with each instance of the class is stored
> in the Keyspace instance. So all the column families within a keyspace will
> share one storage engine instance.
>
> Once a storage engine instance is created, Cassandra sever issues commands
> to the engine instance to performance data storage and retrieval tasks such
> as opening a column family, managing column families and streaming.
>
> How to config storage engine for different keyspaces? It's still open for
> discussion. One proposal is that we can add the storage engine option in
> the create keyspace cql command, and potentially we can overwrite the
> option per C* node in its config file.
>
> Under that API, we implemented a new storage engine, based on RocksDB,
> called RocksEngine. In long term, we want to support most of C* existing
> features in RocksEngine, and we want to build it in a progressive manner.
> For the first version of the RocksDBEngine, we support following features:
> Most of non-nested data types
> Table schema
> Point query
> Range query
> Mutations
> Timestamp
> TTL
> Deletions/Cell tombstones
> Streaming
> We do not supported following features in first version yet:
> Multi-partition query
> Nested data types
> Counters
> Range tombstone
> Materialized views
> Secondary indexes
> SASI
> Repair
> At this moment, we've implemented the V1 features, and deployed it to our
> shadow cluster. Using shadowing traffic of our production use cases, we saw
> ~3X P99 read latency drop, compared to our C* 2.2 prod clusters. Here are
> some detailed metrics: https://docs.google.com/document/d/1DojHPteDPSphO0_
> N2meZ3zkmqlidRwwe_cJpsXLcp10.
>
> So if you need the features in existing storage engine, please keep using
> the existing storage engine. If you want to have a more predictable and
> lower read latency, also the features supported by RocksEngine are enough
> for your use cases, then RocksEngine could be a fit for you.
>
> The work is 1% finished, and we want to work together with community to
> make it happen. We presented the work in NGCC last week, and also pushed
> the beta version of the pluggable storage engine to Instagram github
> Cassandra repo, rocks_3.0 branch (https://github.com/Instagram/
> cassandra/tree/rocks_3.0), which is based on C* 3.0.12, please feel free
> to play with it! You can download it and follow the instructions (
> https://github.com/Instagram/cassandra/blob/rocks_3.0/StorageEngine.md)
> to try it out in your test environment, your feedback will be very valuable
> to us.
>
> Thanks
> Dikang.
>
>


-- 
Dikang


Re: Cassandra pluggable storage engine (update)

2017-10-04 Thread DuyHai Doan
Excellent docs, thanks for the update Dikang.

A question about a design choice: is there any technical reason to specify
the storage engine at the keyspace level rather than the table level?

It's not overly complicated to move all tables sharing the same storage
engine into the same keyspace, but then it makes table organization
strongly tied to the technical storage engine choice rather than functional
splitting.

Regards

On Wed, Oct 4, 2017 at 10:47 PM, Dikang Gu  wrote:

> Hi Blake,
>
> Great questions!
>
> 1. Yeah, we implement the encoding algorithms, which could encode C* data
> types into byte array, and keep the same sorting order. Our implementation
> is based on the orderly lib used in HBase,
> https://github.com/ndimiduk/orderly .
> 2. Repair is not supported yet, we are still working on figure out the work
> need to be done to support repair or incremental repair.
>
> Thanks
> Dikang.
>
> On Wed, Oct 4, 2017 at 1:39 PM, Blake Eggleston 
> wrote:
>
> > Hi Dikang,
> >
> > Cool stuff. 2 questions. Based on your presentation at ngcc, it seems
> like
> > rocks db stores things in byte order. Does this mean that you have code
> > that makes each of the existing types byte comparable, or is clustering
> > order implementation dependent? Also, I don't see anything in the draft
> api
> > that seems to support splitting the data set into arbitrary categories
> (ie
> > repaired and unrepaired data living in the same token range). Is support
> > for incremental repair planned for v1?
> >
> > Thanks,
> >
> > Blake
> >
> >
> > On October 4, 2017 at 1:28:01 PM, Dikang Gu (dikan...@gmail.com) wrote:
> >
> > Hello C* developers:
> >
> > In my previous email
> > (https://www.mail-archive.com/dev@cassandra.apache.org/msg11024.html), I
> > presented that Instagram was kicking off a project to make C*'s storage
> > engine pluggable, like other modern databases such as MySQL, MongoDB,
> > etc., so that users will be able to choose the most suitable storage
> > engine for different workloads, or to use different features. In addition,
> > a pluggable storage engine architecture will improve the modularity of the
> > system and help increase the testability and reliability of Cassandra.
> >
> > After months of development and testing, we'd like to share the work we
> > have done, including the first (draft) version of the C* storage engine
> > API, and the first version of the RocksDB-based storage engine.
> >
> >
> >
> > For the C* storage engine API, here is the draft version we proposed:
> > https://docs.google.com/document/d/1PxYm9oXW2jJtSDiZ-SR9O20jud_0jnA-mW7ttp2dVmk/edit.
> > It contains the APIs for read/write requests, streaming, and table
> > management. The storage-engine-related functionality, such as the data
> > encoding/decoding format, on-disk data reads/writes, and compaction, will
> > be taken care of by the storage engine implementation.
> >
> > Each storage engine is a class, and each instance of the class is stored
> > in the Keyspace instance, so all the column families within a keyspace
> > share one storage engine instance.
> >
> > Once a storage engine instance is created, the Cassandra server issues
> > commands to the engine instance to perform data storage and retrieval
> > tasks, such as opening a column family, managing column families, and
> > streaming.
> >
> > How to configure the storage engine for different keyspaces is still open
> > for discussion. One proposal is to add a storage engine option to the
> > CREATE KEYSPACE CQL command, and potentially to let each C* node override
> > the option in its config file.
> >
> > Under that API, we implemented a new storage engine, based on RocksDB,
> > called RocksEngine. In the long term, we want to support most of C*'s
> > existing features in RocksEngine, and we want to build it in a progressive
> > manner. For the first version of RocksEngine, we support the following
> > features:
> > Most of non-nested data types
> > Table schema
> > Point query
> > Range query
> > Mutations
> > Timestamp
> > TTL
> > Deletions/Cell tombstones
> > Streaming
> > We do not support the following features in the first version yet:
> > Multi-partition query
> > Nested data types
> > Counters
> > Range tombstone
> > Materialized views
> > Secondary indexes
> > SASI
> > Repair
> > At this moment, we've implemented the V1 features and deployed them to
> > our shadow cluster. Using shadow traffic from our production use cases, we
> > saw a ~3X P99 read latency drop compared to our C* 2.2 prod clusters. Here
> > are some detailed metrics:
> > https://docs.google.com/document/d/1DojHPteDPSphO0_N2meZ3zkmqlidRwwe_cJpsXLcp10.
> >
> > So if you need the features in the existing storage engine, please keep
> > using it. If you want more predictable and lower read latency, and the
> > features supported by RocksEngine are enough for your use cases, then
> > RocksEngine could be a good fit for you.
> >
> > The work is 1% finished, and we 

Re: Cassandra pluggable storage engine (update)

2017-10-04 Thread Dikang Gu
Hi DuyHai,

Good point! At this moment, I do not see anything that really prevents us
from having one storage engine type per table; we are using one RocksDB
instance per table anyway. However, we want to do the simple thing first, and
it's easier for us to have a storage engine per keyspace, for both development
and our internal deployment. We can revisit the choice if there is a strong
need for a storage engine per table.
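
As a rough sketch, with the keyspace-level proposal a deployment would look
something like the CQL below. The storage_engine option here is only a straw
man (the exact syntax is still open for discussion); every table created in
the keyspace would then share its RocksEngine instance, backed by one RocksDB
instance per table underneath:

CREATE KEYSPACE rocks_ks
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3}
  AND storage_engine = 'RocksEngine';

-- Stored by RocksEngine because of the keyspace it lives in.
CREATE TABLE rocks_ks.user_profiles (
    user_id bigint PRIMARY KEY,
    name    text,
    bio     text
);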

Thanks
Dikang.

On Wed, Oct 4, 2017 at 1:54 PM, DuyHai Doan  wrote:

> Excellent docs, thanks for the update Dikang.
>
> A question about a design choice, is there any technical reason to specify
> the storage engine at keyspace level rather than table level ?
>
> It's not overly complicated to move all tables sharing the same storage
> engine into the same keyspace but then it makes tables organization
> strongly tied to technical storage engine choice rather than functional
> splitting
>
> Regards
>
> On Wed, Oct 4, 2017 at 10:47 PM, Dikang Gu  wrote:
>
> > Hi Blake,
> >
> > Great questions!
> >
> > 1. Yeah, we implement the encoding algorithms, which encode C* data types
> > into byte arrays while keeping the same sort order. Our implementation is
> > based on the orderly lib used in HBase, https://github.com/ndimiduk/orderly .
> > 2. Repair is not supported yet; we are still working on figuring out the
> > work needed to support repair or incremental repair.
> >
> > Thanks
> > Dikang.
> >
> > On Wed, Oct 4, 2017 at 1:39 PM, Blake Eggleston 
> > wrote:
> >
> > > Hi Dikang,
> > >
> > > Cool stuff. 2 questions. Based on your presentation at ngcc, it seems
> > like
> > > rocks db stores things in byte order. Does this mean that you have code
> > > that makes each of the existing types byte comparable, or is clustering
> > > order implementation dependent? Also, I don't see anything in the draft
> > api
> > > that seems to support splitting the data set into arbitrary categories
> > (ie
> > > repaired and unrepaired data living in the same token range). Is
> support
> > > for incremental repair planned for v1?
> > >
> > > Thanks,
> > >
> > > Blake
> > >
> > >
> > > On October 4, 2017 at 1:28:01 PM, Dikang Gu (dikan...@gmail.com)
> wrote:
> > >
> > > Hello C* developers:
> > >
> > > In my previous email
> > > (https://www.mail-archive.com/dev@cassandra.apache.org/msg11024.html), I
> > > presented that Instagram was kicking off a project to make C*'s storage
> > > engine pluggable, like other modern databases such as MySQL, MongoDB,
> > > etc., so that users will be able to choose the most suitable storage
> > > engine for different workloads, or to use different features. In
> > > addition, a pluggable storage engine architecture will improve the
> > > modularity of the system and help increase the testability and
> > > reliability of Cassandra.
> > >
> > > After months of development and testing, we'd like to share the work
> > > we have done, including the first (draft) version of the C* storage
> > > engine API, and the first version of the RocksDB-based storage engine.
> > >
> > >
> > >
> > > For the C* storage engine API, here is the draft version we proposed:
> > > https://docs.google.com/document/d/1PxYm9oXW2jJtSDiZ-SR9O20jud_0jnA-mW7ttp2dVmk/edit.
> > > It contains the APIs for read/write requests, streaming, and table
> > > management. The storage-engine-related functionality, such as the data
> > > encoding/decoding format, on-disk data reads/writes, and compaction,
> > > will be taken care of by the storage engine implementation.
> > >
> > > Each storage engine is a class, and each instance of the class is
> > > stored in the Keyspace instance, so all the column families within a
> > > keyspace share one storage engine instance.
> > >
> > > Once a storage engine instance is created, the Cassandra server issues
> > > commands to the engine instance to perform data storage and retrieval
> > > tasks, such as opening a column family, managing column families, and
> > > streaming.
> > >
> > > How to configure the storage engine for different keyspaces is still
> > > open for discussion. One proposal is to add a storage engine option to
> > > the CREATE KEYSPACE CQL command, and potentially to let each C* node
> > > override the option in its config file.
> > >
> > > Under that API, we implemented a new storage engine, based on RocksDB,
> > > called RocksEngine. In the long term, we want to support most of C*'s
> > > existing features in RocksEngine, and we want to build it in a
> > > progressive manner. For the first version of RocksEngine, we support the
> > > following features:
> > > Most of non-nested data types
> > > Table schema
> > > Point query
> > > Range query
> > > Mutations
> > > Timestamp
> > > TTL
> > > Deletions/Cell tombstones
> > > Streaming
> > > We do not support the following features in the first version yet:
> > > Multi-partition query
> > > Nested data types
> > > Counters
> > > Range tombstone
> > > Materialized views
> >

Re: Proposal to retroactively mark materialized views experimental

2017-10-04 Thread kurt greaves
>
> So you’d rather continue to lie to users about the stability of the
> feature rather than admitting it was merged in prematurely?

It was merged prematurely, but a lot has changed since then and a lot of
fixes have been made; it's now really no more awful than any other component
of Cassandra. A lot of the commentary in here is coming from people who have
had no part in the recent changes to MVs, and as 3.11.1 is not even out yet,
I doubt anyone really has any idea how stable they currently are. In fact, I
doubt many people have even operated or consulted on clusters running the
most recent versions of 3.0 or 3.11. It's really frustrating that this has
come up now, after we've already done a lot of the work to fix the known
issues, and everyone just seems to be saying "they are so broken" without
anyone really being able to provide evidence of why.

Ideally, we’d follow a process that looks a lot more like this:
> 1. New feature is built with an opt in flag.  Unknowns are documented, the
> risk of using the feature is known to the end user.
> 2. People test and use the feature that know what they’re doing.  They are
> able to read the code, submit patches, and help flush out the issues.  They
> do so in low risk environments.  In the case of MVs, they can afford to
> drop and rebuild the view over a week, or rebuild the cluster altogether.
> We may not even need to worry as much about backwards compatibility.
> 3. The feature matures.  More tests are written.  More people become aware
> of how to contribute to the feature’s stability.
> 4. After a while, we vote on removing the feature flag and declare it
> stable for general usage.


No, I don't think this works very well for Cassandra, because features are
often heavily intertwined with other components of Cassandra, and a new
feature often relies on making changes to those components. At least this is
true for any feature that is large enough to justify having an opt-in flag.
This leads down the path of "oh, it's only experimental/opt-in, so we don't
need to worry about testing every single component", which is wrong.
We provide a database, and users expect stability from all aspects of the
database, at all times. We should be working to fix bugs early so we can
have confidence in the entire database from very early in the release
branch. We shouldn't provide a database of which people will say "don't use
it in production until at least the .15 patch release".

What I see is sufficient distrust coming from core committers, including
> the author of the v1 design, to warrant opt-in for MVs.

Core committers who have had almost nothing to do with MVs for quite some
time. I'm also skeptical of how much first-hand experience these core
committers have with MVs.

We already have those for UDFs and CDC.
> We should have more: for triggers, SASI, and MVs, at least. Operators need
> a way to disable features they haven’t validated.

After a bit more thought I've changed my mind on this. Operators do need a
way to disable features, but it makes a lot more sense to have that as part
of the auth/roles system rather than as yaml properties (see the rough
sketch at the end of this mail). Plus, as previously noted, I'm not of the
opinion that we should release features in a beta/experimental form at all;
we should be reasonably confident in the entire system and in any new
features being introduced before releasing them. We should also better
practice incremental release of features, starting with a bare minimum, or a
subset of what we want the end product to be, rather than releasing a
massive change and then calling it experimental for years until we can
somehow deduce that it is stable enough. This could have been done for MVs
by starting with an append-only use case, and then moving on to the more
complex transactional use case.
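
To illustrate the auth/roles idea, something shaped like the sketch below is
what I have in mind. The option and its semantics are purely hypothetical
(no such feature gating exists today); the point is only that the role
machinery, rather than cassandra.yaml, would carry the switch:

CREATE ROLE analytics WITH LOGIN = true;
-- Hypothetical: the role manager, not the node config, decides whether this
-- role may create or query materialized views.
ALTER ROLE analytics WITH OPTIONS = { 'materialized_views' : 'disabled' };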