Re: Compact Storage and SuperColumn Tables in 4.0/trunk

2017-09-19 Thread Alex P
> If we provide a way to drop the flag, but still access the data, I think that 
> is fine and perfectly reasonable.  If the proposal here is that users who 
> have data in COMPACT STORAGE tables have no way to upgrade to 4.0 and still 
> access that data without exporting it to a brand new table, then I am against 
> it.  Can you clarify which thing is being proposed?  It is not clear to me.


Some details on how compact storage and Thrift tables are implemented:

When a table is created through Thrift or with the COMPACT STORAGE flag, it has a
`value` column, which is invisible to CQL queries and visible only through
Thrift.

SuperColumn families internally (at the storage level) have a partition key, a
clustering column, and a value column of type `map<>`. Thrift exposes this as a
“normal” super column family. CQL, however, does not expose the `map` column.
Instead, it translates the map key into a second clustering column and the map
value into a regular column.
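
To make the translation concrete, here is a minimal Python sketch of the idea. It is only a toy model: the type aliases, the `to_cql_rows` function, and the sample data are all made up for illustration and are not Cassandra code.

```python
from typing import Dict, List, Tuple

# Hypothetical model of the internal layout of a super column family:
# (partition key, clustering, map of subcolumn name -> value).
InternalRow = Tuple[str, str, Dict[str, str]]
# Hypothetical CQL view: (partition key, clustering, second clustering, value).
CqlRow = Tuple[str, str, str, str]


def to_cql_rows(internal_rows: List[InternalRow]) -> List[CqlRow]:
    """Flatten every map entry into its own CQL row.

    The map key becomes a second clustering column and the map value
    becomes a regular column.
    """
    rows: List[CqlRow] = []
    for key, clustering, subcolumns in internal_rows:
        # Sort by map key so the output order mimics clustering order.
        for map_key in sorted(subcolumns):
            rows.append((key, clustering, map_key, subcolumns[map_key]))
    return rows


# Made-up sample data: one internal row holding two subcolumns.
internal = [("user1", "address", {"zip": "10115", "city": "Berlin"})]
for row in to_cql_rows(internal):
    print(row)
```

In this model one internal row with a two-entry map shows up as two CQL rows, which is the kind of reshaping the CQL layer has to special-case.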

All of this requires quite some special-casing everywhere across the CQL layer, 
in order to hide/show and translate the columns depending on whether the table 
is dense, super and so on.

For more details, take a look at CASSANDRA-8099 or CASSANDRA-12373.

In short: dropping the COMPACT STORAGE flag means that your tables remain
accessible and their internal representation (e.g. the hidden value column) is
exposed as if it were a normal column. No data will be lost and no data will
become inaccessible. See CASSANDRA-10857 for the details.
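
As a rough illustration of what "exposed as if it were a normal column" means, here is a small Python sketch. It is a toy schema model, not Cassandra's implementation: `TableModel`, `visible_columns`, `drop_compact_storage`, and the column names are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TableModel:
    """Toy schema model; every name here is illustrative only."""
    columns: List[str]            # all columns in the internal representation
    hidden: List[str]             # columns hidden from CQL while the flag is set
    compact_storage: bool = True  # stands in for the COMPACT STORAGE flag


def visible_columns(table: TableModel) -> List[str]:
    """Columns a CQL client would see under this toy model."""
    if table.compact_storage:
        return [c for c in table.columns if c not in table.hidden]
    return list(table.columns)


def drop_compact_storage(table: TableModel) -> TableModel:
    """Clear the flag; the stored columns themselves are left untouched."""
    return TableModel(list(table.columns), list(table.hidden), compact_storage=False)


table = TableModel(columns=["key", "column1", "value"], hidden=["value"])
print(visible_columns(table))
print(visible_columns(drop_compact_storage(table)))
```

The point of the sketch is that only the visibility rule changes when the flag is dropped; the column list (the data) is the same before and after.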

As regards SuperColumn families, my proposal is to have full support in
3.0/3.11 (LWTs, counters, all sorts of queries, exactly as they were
accessible through CQL in 2.2).

There will be a clear upgrade path, but I suggest that DROP COMPACT STORAGE
be available in 3.x only.

4.x will still make the same data available, but will expose the whole internal
CQL structure, including the usually “hidden” compact value column, without any
legacy special-casing.

> On 19 Sep 2017, at 17:23, Jeremiah D Jordan wrote:
> 
> I think that all the work to support Compact Storage tables from CQL seems 
> like wasted effort if we are going to tell people “just kidding, you have to 
> migrate all your data”.  I do not think supporting “COMPACT STORAGE” as a 
> table option matters one way or the other.  But I do think being able to read 
> the data that was in a table created that way is something we need to have a 
> path forward for.
> 
>> since thrift is not supported on trunk/4.0, it makes it much less appealing 
>> or even necessary
> 
> I think that the fact thrift is not supported on trunk/4.0 makes accessing 
> said data from CQL *MORE* necessary and appealing.
> 
>> possibility to drop a Compact Storage flag and expose them as “normal” tables,
>> there was an idea of removing the Compact Tables from 4.x altogether.
> 
> If we provide a way to drop the flag, but still access the data, I think that 
> is fine and perfectly reasonable.  If the proposal here is that users who 
> have data in COMPACT STORAGE tables have no way to upgrade to 4.0 and still 
> access that data without exporting it to a brand new table, then I am against 
> it.  Can you clarify which thing is being proposed?  It is not clear to me.
> 
> -Jeremiah
> 
> 
>> On Sep 19, 2017, at 7:10 AM, Oleksandr Petrov wrote:
>> 
>> As you may know, SuperColumn Tables did not work in 3.x the way they worked 
>> in 2.x. In order to provide everyone with a reasonable upgrade path, we've 
>> been working on CASSANDRA-12373[1], which brings in support for SuperColumn
>> tables as close to 2.x as possible. The patch is planned to land in the
>> cassandra-3.0 and cassandra-3.11 branches only, since the patch for trunk
>> will require even more work and, since thrift is not supported on trunk/4.0, 
>> it makes it much less appealing or even necessary. The idea behind the 
>> support for SuperColumns was always only to allow people to smoothly migrate 
>> off them in 3.0/3.11 world, not to have them as a primary feature.
>> 
>> SuperColumns are not the only type of Compact Table, there are more. After 
>> CASSANDRA-8099[2], Compact Tables are special-cased and have special schema 
>> layout with some columns hidden from CQL, that allows them to be used from 
>> Thrift. But, except for the fact they’re accessible from Thrift, there are 
>> no advantages to use them with the new storage. In order to allow people to 
>> “expose” the internal structure of the compact tables to make them fully 
>> accessible in CQL, CASSANDRA-10857[3] was created.
>> 
>> In the light of the fact that 4.0 will not have reasonable SuperColumn
>> support (due to related complexity and amount of special-cases required to
>> support it in 4.0) and a possibility to drop a Compact Storage flag and expose
>> them as “normal” tables, there was an idea of removing the Compact Tables
>> from 4.x alto

Re: V5 as a protocol beta version in 3.11

2017-11-07 Thread Alex P
This makes sense and is in line with the previous discussions about v5. I
agree with Adam and Jeremiah on that.

Thank you for the input. I will adjust the tests tomorrow.

Best regards,
Alex Petrov

> On 7. Nov 2017, at 18:32, Adam Holmberg wrote:
> 
> I agree that it is okay to leave v5 beta behind. As I recall, the point of
> beta was less about trying stuff early, but more to allow early
> implementation and testing of new protocol features, before the scope was
> finalized. Now that v5 proper has diverged from beta it is no longer
> supported. I don't see much value in back-porting, nor do I think we
> should  increment versions in order to maintain compatibility with
> something that was expressly beta.
> 
> I think we should disable v5 testing in 3.x branch and let the v5 spec
> continue to evolve in *non-beta* status in 4.0 until it is finalized upon
> release.
> 
> Adam Holmberg
> 
> On Tue, Nov 7, 2017 at 10:57 AM, J. D. Jordan wrote:
> 
>> Again. V5 beta in 3.11 was always meant to stop working when future things
>> happened to V5 in the drivers and in C*.  I see no problem with leaving the
>> beta V5, which is an opt in thing to try out, in 3.11 alone. 4.0 will have
>> the full non beta V5 with extra stuff in it, and will not work with beta V5.
>> Nothing uses the beta V5 by default. It is an opt in thing to be used if
>> you wanted to try out stuff early.
>> 
>> -Jeremiah
>> 
>>> On Nov 7, 2017, at 11:39 AM, Oleksandr Petrov <oleksandr.pet...@gmail.com> wrote:
>>> 
>>> This is an option, you're right. In this case v5 will have just one
>>> feature, however, and the only feature (Duration type) should work via
>>> CustomTypes through v4.
>>> 
>>> Looks like the Jira numbers were off, so let me do it again:
>>> 
>>> In 3.11 we have:
>>> * CASSANDRA-12838 - Extend native protocol flags and add supported
>>> versions to the SUPPORTED response
>>> * CASSANDRA-12142 - Add "beta" version native protocol flag
>>> * CASSANDRA-12850 - Add the duration type to the protocol V5 <-- (this
>>> one should also work with v4)
>>> 
>>> In 4.0 we have
>>> * CASSANDRA-10786 - Include hash of result set metadata in prepared
>>> statement id
>>> 
>>> And the options:
>>> 
>>> * (1) remove v5 from 3.11 by reverting #12838 and #12142
>>> * (2) support v5 in 3.11 forever and backport #10786
>>> * (3) bump 4.0 version to v6 and make sure that #10786 is v6
>>> 
>>> Question with (1) is mostly whether or not we would like to cut another
>>> version release because of (in essence) #12838 only, since #12142 is not
>>> relevant in the context and #12850 will still work.
>>> 
>>> 
>>> 
>>>> On Tue, Nov 7, 2017 at 4:19 PM Jonathan Haddad wrote:
>>>>
>>>> The other option, to avoid having two different v5 implementations, is to
>>>> bump 4.0’s protocol version to 6.
>>>> On Tue, Nov 7, 2017 at 6:48 AM Jeremiah D Jordan <jeremiah.jor...@gmail.com>
>>>> wrote:
>>>>
>>>>> My 2 cents.  When we added V5 to 3.x wasn’t it added as a beta protocol
>>>>> for tick/tock stuff and known that when a new version came out it would
>>>>> most possibly break the older releases V5 beta stuff? Or at the very least
>>>>> add new things to V5.  So I see no reason to need to add more new features
>>>>> to 3.11 v5.
>>>>>
>>>>> -Jeremiah
>>>>>
>>>>>> On Nov 7, 2017, at 9:41 AM, Oleksandr Petrov <oleksandr.pet...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>> Hi everyone,
>>>>>>
>>>>>> Currently, 3.11 supports V5 as a protocol version. However, all new
>>>>>> features are now going to 4.0, which is going to be a new feature release.
>>>>>>
>>>>>> Right now we have two v5 features:
>>>>>>
>>>>>> - CASSANDRA-10786 <https://issues.apache.org/jira/browse/CASSANDRA-10786>
>>>>>> - CASSANDRA-12838 <https://issues.apache.org/jira/browse/CASSANDRA-12838>
>>>>>>
>>>>>> #12838 is adding duration type, which is a nice addition. #10786 is also
>>>>>> useful, but is more of an edge case for users with huge clusters and/or
>>>>>> frequent schema changes.
>>>>>>
>>>>>> If we leave v5 in 3.11, we'll have to always backport all v5 features to
>>>>>> 3.11. This is something that hasn't been done in #10786. So the question
>>>>>> is: are we ready to commit and support v5 in 3.11 "forever", or should we
>>>>>> stop before it goes too far and remove v5 from 3.11 since it's still in
>>>>>> beta there.
>>>>>>
>>>>>> Looking forward to hearing your opinion,
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Alex Petrov
>>>>>
>>>>>
>>>>> -
>>>>> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
>>>>> For additional commands, e-mail: dev-h...@cassandra.apache.org
>>>>>
>>>>>
>>>>
>>> --
>>> Alex Petrov
>> 
>>
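
For readers unfamiliar with the "opt in" behaviour of a beta protocol version discussed in this thread, the following Python sketch shows the general idea. It is not Cassandra's code: `negotiate`, `ProtocolError`, and the version sets are hypothetical, and the `0x10` bit used for the beta flag is an assumption made for illustration.

```python
from typing import Set

# Assumed bit for the "use beta" frame flag; treat the value as illustrative.
USE_BETA_FLAG = 0x10


class ProtocolError(Exception):
    """Raised when the requested protocol version cannot be used."""


def negotiate(version: int, flags: int, supported: Set[int], beta: Set[int]) -> int:
    """Accept a stable version, or a beta version only if the client opted in."""
    if version in supported:
        return version
    if version in beta:
        if flags & USE_BETA_FLAG:
            return version
        raise ProtocolError(f"v{version} is beta; the client must set the beta flag")
    raise ProtocolError(f"v{version} is not supported")


# A 3.11-like server in this toy model: v3 and v4 are stable, v5 is beta.
stable, beta = {3, 4}, {5}
print(negotiate(4, 0, stable, beta))
print(negotiate(5, USE_BETA_FLAG, stable, beta))
```

A client that asks for v5 without the flag is rejected in this model, which matches the point made above that nothing uses beta v5 by default. The sketch does not model the incompatibility between beta v5 in 3.11 and final v5 in 4.0.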