Re: [DISCUSS] Introduce CREATE TABLE LIKE grammer

guo Maxwell Mon, 04 Nov 2024 07:28:15 -0800

Hi stefan
1、yes, cross-keyspace copying will be much complicated than copying under
same keyspace , but I think we can support it in the future , and I think
it is under the scope of this CEP , so I add it .Or is it that the work
planned for the next step should not be listed here for the time being?
I don't know the rules very well here, and I hope if you can help point out
the unreasonable points 😀 , because I do plan to complete this task,
although I have only implemented the same keyspace now.
2、yes, you are right, I gave up ALL at the first time , But after I replied
to yifan’s email, I communicated with him privately through slack. In the
end, I was not strongly opposed to ALL (Sorry, we communicated in Chinese,
https://the-asf.slack.com/archives/D07SXB787HN/p1729136909357689), In
addition, I later saw that you were +0, so I added ALL back.
3、the change to Parse.g will be like :


> /**
>  * CREATE TABLE [IF NOT EXISTS] <NEW_TABLE>
>  * LIKE <OLD_TABLE>
>  * [ WITH OPTIONS AND INDEXES AND TRIGGERS ]
>  */
> copyTableStatement returns  [CopyTableStatement.Raw stmt]
>     @init { boolean ifNotExists = false; }
>     : K_CREATE K_COLUMNFAMILY newCf=columnFamilyName LIKE 
> oldCf=columnFamilyName
>       { $stmt = new CopyTableStatement.Raw(newCf, oldCf); }
>       tableLikeOptions[stmt]
>     ;
>
> tableLikeOptions[CopyTableStatement.Raw stmt]
>     : ( K_WITH tableLikeSingleOption[stmt] ( K_AND 
> tableLikeSingleOption[stmt] )*)?
>     ;
>
> tableLikeSingleOption[CopyTableStatement.Raw stmt]
>     : option=STRING_LITERAL { $stmt.extendWithLikeOptions($option.text); }
>     ;
>
> I don’t plan to reuse the Create table definition file, and there doesn’t
seem to be much need. And I have made a explanation  in the cep file
<https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-43++Apache+Cassandra+CREATE+TABLE++LIKE>

Thanks.

Štefan Miklošovič <[email protected]> 于2024年11月4日周一 17:00写道：

> Hi Maxwell,
>
> 1) I noticed that there is table copying across keyspaces in your goal
> number 2) in the CEP. Is this correct? I was thinking that we are doing
> same-keyspace copying for now and it will be considered later, as you
> elaborate on that further down the document. Cross-keyspace copying would
> mean (among other things) that we would need to create UDTs in another
> keyspace as well which would complicate it etc ...
>
> 2) I also see this
>
> CREATE TABLE <NEW_TABLE> LIKE <OLD_TABLE>  [ WITH ALL | [ INDEXES AND
> TRIGGERS]]
>
> Is this really correct? I think we agreed that ALL will not be supported.
> You gave up on ALL in this comment of yours (the first sentence) (1)
>
> 3) It would be great if you were more explicit about the proposed CQL
> changes in such a way that after the CEP is delivered, it would be possible
> to override the options on a new table. Basically what Dave summarized here
> (2) at the very bottom. All three examples should be mentioned in CEP for
> being explicit about our intentions.
>
> After this is all reflected, I will be glad to vote on this CEP in the
> other thread.
>
> (1) https://lists.apache.org/thread/d485w6lxvpoztmjnxj8msj0jjt3d5ltk
> (2) https://lists.apache.org/thread/odc1s1pt5m2tk76owxq61y55kytf13sf
>
> On Wed, Oct 30, 2024 at 4:28 AM guo Maxwell <[email protected]> wrote:
>
>> So we should be able to start voting on this now.
>>
>> guo Maxwell <[email protected]> 于2024年10月28日周一 17:20写道：
>>
>>> Here  is the latest updated CEP-43
>>> <https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-43++Apache+Cassandra+CREATE+TABLE++LIKE>
>>>
>>>
>>> guo Maxwell <[email protected]> 于2024年10月24日周四 19:53写道：
>>>
>>>> yes，you are right. I will add this
>>>>
>>>> Štefan Miklošovič <[email protected]>于2024年10月24日 周四下午4:42写道：
>>>>
>>>>> The CEP should also mention that copying system tables or virtual
>>>>> tables or materialized views and similar are not supported and an attempt
>>>>> of doing so will error out.
>>>>>
>>>>> On Thu, Oct 24, 2024 at 7:16 AM Dave Herrington <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Strong +1 to copy all options by default. This is intuitive to me.
>>>>>> Then I would like to explicitly override any options of my choosing.
>>>>>>
>>>>>> -Dave
>>>>>>
>>>>>> On Wed, Oct 23, 2024 at 9:57 PM guo Maxwell <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> OK，thank you for your suggestions ，I will revise the CEP and copy
>>>>>>> table OPTIONS by default.
>>>>>>>
>>>>>>> Jon Haddad <[email protected]>于2024年10月23日 周三下午9:18写道：
>>>>>>>
>>>>>>>> Also strongly +1 to copying all the options.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Oct 23, 2024 at 5:52 AM Josh McKenzie <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> I'm a very strong +1 to having the default functionality be to
>>>>>>>>> copy *ALL* options.
>>>>>>>>>
>>>>>>>>> Intuitively, as a user, if I tell a software system to make a
>>>>>>>>> clone of something I don't expect it to be shallow or a subset 
>>>>>>>>> defined by
>>>>>>>>> some external developer somewhere. I expect it to be a clone.
>>>>>>>>>
>>>>>>>>> Adding in some kind of "lean" mode or "column only" is fine if
>>>>>>>>> someone can make a cogent argument around its inclusion. I don't 
>>>>>>>>> personally
>>>>>>>>> see a use-case for it right now but definitely open to being educated.
>>>>>>>>>
>>>>>>>>> On Wed, Oct 23, 2024, at 3:03 AM, Štefan Miklošovič wrote:
>>>>>>>>>
>>>>>>>>> options are inherently part of that table as well, same as schema.
>>>>>>>>> In fact, _schema_ includes all options. Not just columns and its 
>>>>>>>>> names. If
>>>>>>>>> you change some option, you effectively have a different schema, 
>>>>>>>>> schema
>>>>>>>>> version changes by changing an option. So if we do not copy options 
>>>>>>>>> too, we
>>>>>>>>> are kind of faking it (when we do not specify WITH OPTIONS).
>>>>>>>>>
>>>>>>>>> Also, imagine a situation where Accord is merged to trunk. It
>>>>>>>>> introduces a new schema option called "transactional = full" which is 
>>>>>>>>> not
>>>>>>>>> default. (I am sorry if I did the spelling wrong here). So, when you 
>>>>>>>>> have a
>>>>>>>>> table with transactional support and you do "create table ks.tb_copy 
>>>>>>>>> like
>>>>>>>>> ks.tb", when you _do not_ copy all options, this table will _not_ 
>>>>>>>>> become
>>>>>>>>> transactional.
>>>>>>>>>
>>>>>>>>> The next thing you go to do is to execute some transactions
>>>>>>>>> against this table but well ... you can not do that, because your 
>>>>>>>>> table is
>>>>>>>>> not transactional, because you have forgotten to add "WITH OPTIONS". 
>>>>>>>>> So you
>>>>>>>>> need to go back to that and do "ALTER ks.tb_copy WITH transactional = 
>>>>>>>>> full"
>>>>>>>>> just to support that.
>>>>>>>>>
>>>>>>>>> I think that you see from this pattern that it is way better if we
>>>>>>>>> copy all options by default instead of consciously opt-in into them.
>>>>>>>>>
>>>>>>>>> also:
>>>>>>>>>
>>>>>>>>> "but I think there are also some users want to do basic column
>>>>>>>>> information copy"
>>>>>>>>>
>>>>>>>>> where is this coming from? Do you have this idea somehow
>>>>>>>>> empirically tested? I just do not see why somebody would want to have
>>>>>>>>> Cassandra's defaults instead of what a base table contains.
>>>>>>>>>
>>>>>>>>> On Wed, Oct 23, 2024 at 8:28 AM guo Maxwell <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> The reason for using OPTION keyword is that I want to provide
>>>>>>>>> users with more choices .
>>>>>>>>> The default behavior for copying a table is to copy the basic item
>>>>>>>>> of table （column and their data type，mask，constraint），others thing 
>>>>>>>>> belongs
>>>>>>>>> to the table like option，views，trigger
>>>>>>>>> are optional in my mind.
>>>>>>>>> You are absolutely right that users may want to copy all stuff but
>>>>>>>>> I think there are aslo some users want to do basic column information
>>>>>>>>> copy，So I just give them a choice。As we know that the number of table
>>>>>>>>> parameters is not
>>>>>>>>> small，compression，compaction，gc_seconds，bf_chance，speculative_retry 
>>>>>>>>> and so
>>>>>>>>> on.
>>>>>>>>>
>>>>>>>>> Besides we can see that pg have also the keyword
>>>>>>>>> COMMENT，COMPRESSION which have the similar behavior as our OPTION 
>>>>>>>>> keyword。
>>>>>>>>>
>>>>>>>>> So that is why I add this keyword OPTION.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Štefan Miklošovič <[email protected]>于2024年10月22日
>>>>>>>>> 周二下午11:40写道：
>>>>>>>>>
>>>>>>>>> The problem is that when I do this minimal CQL which shows this
>>>>>>>>> feature:
>>>>>>>>>
>>>>>>>>> CREATE TABLE ks.tb_copy LIKE ks.tb;
>>>>>>>>>
>>>>>>>>> then you are saying that when I _do not_ specify WITH OPTIONS then
>>>>>>>>> I get Cassandra's defaults. Only after I specify WITH OPTIONS, it 
>>>>>>>>> would
>>>>>>>>> truly be a copy.
>>>>>>>>>
>>>>>>>>> This is not a good design. Because to have an exact copy, I have
>>>>>>>>> to make a conscious effort to include OPTIONS as well. That should 
>>>>>>>>> not be
>>>>>>>>> the case. I just want to have a copy, totally the same stuff, when I 
>>>>>>>>> use
>>>>>>>>> the minimal version of that statement. It would be better to opt-out 
>>>>>>>>> from
>>>>>>>>> options like
>>>>>>>>>
>>>>>>>>> CREATE TABLE ks.tb_copy LIKE ks.tb WITHOUT OPTIONS (you feel me)
>>>>>>>>> but we do not support this (yet).
>>>>>>>>>
>>>>>>>>> On Tue, Oct 22, 2024 at 5:28 PM Štefan Miklošovič <
>>>>>>>>> [email protected]> wrote:
>>>>>>>>>
>>>>>>>>> I just don't see OPTIONS as important. When I want to copy a
>>>>>>>>> table, I am copying a table _with everything_. Options included, by
>>>>>>>>> default. Why would I want to have a copy of a table with options 
>>>>>>>>> different
>>>>>>>>> from the base one?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Oct 21, 2024 at 3:55 PM Bernardo Botella <
>>>>>>>>> [email protected]> wrote:
>>>>>>>>>
>>>>>>>>> Hi Guo,
>>>>>>>>>
>>>>>>>>> +1 for the CONSTRAINTS keyword to be added into the default
>>>>>>>>> behavior.
>>>>>>>>>
>>>>>>>>> Bernardo
>>>>>>>>>
>>>>>>>>> On Oct 21, 2024, at 12:01 AM, guo Maxwell <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> I think the CONSTRAINTS keyword  keyword may be in the same
>>>>>>>>> situation as datamask.
>>>>>>>>> Maybe it is better to include  constraints into  the default
>>>>>>>>> behavior of table copy together with column name, column data type 
>>>>>>>>> and data
>>>>>>>>> mask.
>>>>>>>>>
>>>>>>>>> guo Maxwell <[email protected]> 于2024年10月21日周一 14:56写道：
>>>>>>>>>
>>>>>>>>> To yifan :
>>>>>>>>> I don't mind adding the ALL keyword, and it has been updated into
>>>>>>>>> CEP.
>>>>>>>>>
>>>>>>>>> As all you can see, our original intention was that the grammar
>>>>>>>>> would not be too complicated, which is what I described in cep
>>>>>>>>> <https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-43++Apache+Cassandra+CREATE+TABLE++LIKE>
>>>>>>>>> .
>>>>>>>>> We gave up PG-related grammar, including INCLUDING/EXCLUDING and
>>>>>>>>> so on .
>>>>>>>>>
>>>>>>>>> guo Maxwell <[email protected]> 于2024年10月21日周一 14:52写道：
>>>>>>>>>
>>>>>>>>> Hi ,
>>>>>>>>> To sefan :
>>>>>>>>> I may want to explain that if there is no OPTION keyword in the
>>>>>>>>> CQL statement, then the newly created table will only have the
>>>>>>>>> original table's  column name 、column type and data mask ,I think 
>>>>>>>>> this is
>>>>>>>>> the most basic choice when copying tables to users.
>>>>>>>>> Then  we do some  addition, we can add original table's table
>>>>>>>>> options like compaction strategy/compress strategy、index and so on.
>>>>>>>>>
>>>>>>>>> Recently, I have also thought about the situation of CONSTRAINTS
>>>>>>>>> keyword. I think it is similar to data mask. Agree that it should be
>>>>>>>>> included in the basic options of  table copy (column name, column 
>>>>>>>>> data type
>>>>>>>>> , column data mask and constraints).
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Dave Herrington <[email protected]> 于2024年10月19日周六 01:15写道：
>>>>>>>>>
>>>>>>>>> It seems like a natural extension of the CREATE TABLE statement.
>>>>>>>>> Looking forward to using it in the future.
>>>>>>>>>
>>>>>>>>> -Dave
>>>>>>>>>
>>>>>>>>> On Thu, Oct 17, 2024 at 5:11 PM Štefan Miklošovič <
>>>>>>>>> [email protected]> wrote:
>>>>>>>>>
>>>>>>>>> Right?! Reads like English, the impact on the existing CQL is
>>>>>>>>> minimal. One LIKE which basically needs to be there and keywords of 
>>>>>>>>> logical
>>>>>>>>> "components" which seamlessly integrate with WITH.
>>>>>>>>>
>>>>>>>>> I would _not_ use WITH CONSTRAINTS because constraints will be
>>>>>>>>> inherently part of a table schema. It is not an "option". We can not
>>>>>>>>> "opt-out" from them. Remember we are copying a table here so if a 
>>>>>>>>> base one
>>>>>>>>> has constraints, its copy will have them too. A user can subsequently
>>>>>>>>> "ALTER" them.
>>>>>>>>>
>>>>>>>>> On Thu, Oct 17, 2024 at 5:31 PM Dave Herrington <
>>>>>>>>> [email protected]> wrote:
>>>>>>>>>
>>>>>>>>> Basing it on CREATE TABLE, the BNF definition of the simple
>>>>>>>>> implementation would look something like this:
>>>>>>>>>
>>>>>>>>> create_table_statement::= CREATE TABLE [ IF NOT EXISTS ]
>>>>>>>>> table_name LIKE base_table_name
>>>>>>>>> [ WITH included_objects ] [ [ AND ] table_options ]
>>>>>>>>> table_options::= COMPACT STORAGE [ AND table_options ]
>>>>>>>>> | CLUSTERING ORDER BY '(' clustering_order ')'
>>>>>>>>> [ AND table_options ]  | options
>>>>>>>>> clustering_order::= column_name (ASC | DESC) ( ',' column_name
>>>>>>>>> (ASC | DESC) )*
>>>>>>>>> included_objects::= dependent_objects [ AND dependent_objects ]
>>>>>>>>> dependent_objects:= INDEXES | TRIGGERS | CONSTRAINTS | VIEWS
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> CREATE TABLE [ IF NOT EXISTS ] [<keyspace_name>.]<table_name> LIKE
>>>>>>>>> [<keyspace_name>.]<base_table_name>
>>>>>>>>>   [ WITH [ <included_objects > ]
>>>>>>>>>   [ [ AND ] [ <table_options> ] ]
>>>>>>>>>   [ [ AND ] CLUSTERING ORDER BY [ <clustering_column_name> (ASC |
>>>>>>>>> DESC) ] ]
>>>>>>>>> ;
>>>>>>>>>
>>>>>>>>> Examples:
>>>>>>>>>
>>>>>>>>> -- Create base table:
>>>>>>>>> CREATE TABLE cycling.cyclist_name (
>>>>>>>>>   id UUID PRIMARY KEY,
>>>>>>>>>   lastname text,
>>>>>>>>>   firstname text
>>>>>>>>> );
>>>>>>>>>
>>>>>>>>> -- Create an exact copy of the base table, but do not create any
>>>>>>>>> dependent objects:
>>>>>>>>> CREATE TABLE cycling.cyclist_name2 LIKE cycling.cyclist_name;
>>>>>>>>>
>>>>>>>>> -- Create an exact copy with all dependent objects (constraints
>>>>>>>>> excluded for now):
>>>>>>>>> CREATE TABLE cycling.cyclist_name3 LIKE cycling.cyclist_name
>>>>>>>>> WITH INDEXES AND TRIGGERS AND VIEWS;
>>>>>>>>>
>>>>>>>>> -- Create a copy with LCS compaction, a default TTL and all
>>>>>>>>> dependent objects except indexes:
>>>>>>>>> CREATE TABLE cycling.cyclist_name4 LIKE cycling.cyclist_name
>>>>>>>>> WITH TRIGGERS AND VIEWS
>>>>>>>>> AND compaction = { 'class' :  'LeveledCompactionStrategy' }
>>>>>>>>> AND default_time_to_live = 86400;
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> This seems pretty clean & straightforward.
>>>>>>>>>
>>>>>>>>> -Dave
>>>>>>>>>
>>>>>>>>> On Thu, Oct 17, 2024 at 4:05 PM Dave Herrington <
>>>>>>>>> [email protected]> wrote:
>>>>>>>>>
>>>>>>>>> This simple approach resonates with me.  I think the Cassandra doc
>>>>>>>>> uses "INDEXES" as the plural for index, i.e.:
>>>>>>>>> https://cassandra.apache.org/doc/stable/cassandra/cql/indexes.html
>>>>>>>>>
>>>>>>>>> -Dave
>>>>>>>>>
>>>>>>>>> On Thu, Oct 17, 2024 at 2:39 PM Štefan Miklošovič <
>>>>>>>>> [email protected]> wrote:
>>>>>>>>>
>>>>>>>>> Well we could do something like:
>>>>>>>>>
>>>>>>>>> CREATE TABLE ks.tb_copy LIKE ks.tb WITH INDICES AND TRIGGERS AND
>>>>>>>>> compaction = {'class': '.... } AND ...
>>>>>>>>>
>>>>>>>>> but I can admit it might be seen as an overreach and I am not sure
>>>>>>>>> at all how it would look like in the implementation because we would 
>>>>>>>>> need
>>>>>>>>> to distinguish WITH INDICES from table options.
>>>>>>>>>
>>>>>>>>> I would
>>>>>>>>>
>>>>>>>>>    1. +0 on ALL. - we don't need this. If we have just INDICES,
>>>>>>>>>    TRIGGERS, VIEWS at this point, I don't think enumerating it all is 
>>>>>>>>> too much
>>>>>>>>>    to ask. This is just an implementation detail and if we find it 
>>>>>>>>> necessary
>>>>>>>>>    we can add it later. If you feel strongly about this then add that 
>>>>>>>>> but it
>>>>>>>>>    is not absolutely necessary.
>>>>>>>>>    2. omit OPTIONS - aren't all options copied by default? That
>>>>>>>>>    is the goal of the CEP, no? We might just use normal CQL while
>>>>>>>>>    overriding from the base table
>>>>>>>>>    3. mix keywords like TRIGGERS / INDICES / CONSTRAINTS into
>>>>>>>>>    normal table creation statement
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Thu, Oct 17, 2024 at 3:20 PM Yifan Cai <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> I would second Štefan's option for functionality simplicity. It
>>>>>>>>> seems to be unnecessary to have the keywords for both inclusion and
>>>>>>>>> exclusion in the CEP. If needed, the exclusion (WITHOUT) can be 
>>>>>>>>> introduced
>>>>>>>>> later. It would still be backward compatible.
>>>>>>>>>
>>>>>>>>> Regarding "CREATE TABLE ks.tb_copy LIKE ks.tb WITH compaction =
>>>>>>>>> {'class': '.... } AND ... ", I think it only overrides the table 
>>>>>>>>> options.
>>>>>>>>> The CEP suggests the coarse-grained keyword for each category like 
>>>>>>>>> table
>>>>>>>>> options, indexes, etc. The functionality provided is not identical.
>>>>>>>>>
>>>>>>>>> I understand that the suggestions are to make operators' life
>>>>>>>>> easier by achieving table creation in a single statement. What is 
>>>>>>>>> being
>>>>>>>>> proposed in the CEP seems to be at a good balance point. Operators can
>>>>>>>>> alter the table options if needed in the follow-up ALTER table 
>>>>>>>>> statement.
>>>>>>>>>
>>>>>>>>> - Yifan
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Thu, Oct 17, 2024 at 1:41 PM Štefan Miklošovič <
>>>>>>>>> [email protected]> wrote:
>>>>>>>>>
>>>>>>>>> I think we are starting to complicate it. For me the most
>>>>>>>>> important question is who is actually this feature for? If people 
>>>>>>>>> want to
>>>>>>>>> just prototype something fast or they just want to have "the same 
>>>>>>>>> table
>>>>>>>>> just under a different name", I think that is going to be used in 99% 
>>>>>>>>> of
>>>>>>>>> cases.
>>>>>>>>>
>>>>>>>>> My assumption of using WITH which I think I proposed first (4th
>>>>>>>>> post in this thread) was to just blindly copy the most important 
>>>>>>>>> "parts"
>>>>>>>>> logically related to a table, be it indices, materialized views, or
>>>>>>>>> triggers and enable / disable them as we wish. If no "WITH" is used, 
>>>>>>>>> then
>>>>>>>>> we just get a table with nothing else. "WITH" will opt-in into that.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Seeing us contemplating using "INCLUDING" and "EXCLUDING" on
>>>>>>>>> individual options makes me sad a little bit. I think we are
>>>>>>>>> over-engineering this. I just don't see a reasonable use-case where 
>>>>>>>>> users
>>>>>>>>> would need to cherry-pick what they want and what not. Isn't that 
>>>>>>>>> just too
>>>>>>>>> complicated? If a table being copied drifts away too much from the 
>>>>>>>>> original
>>>>>>>>> one then users would be better off with creating a brand new table 
>>>>>>>>> with CQL
>>>>>>>>> as they are used to, not dealing with "copying" at all. More we drift 
>>>>>>>>> from
>>>>>>>>> what the original table was like, the less useful this feature is.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Oct 16, 2024 at 10:03 PM Dave Herrington <
>>>>>>>>> [email protected]> wrote:
>>>>>>>>>
>>>>>>>>> Sorry that I overlooked the definition of the default in the CEP.
>>>>>>>>> I did look for it but I didn’t see it.
>>>>>>>>>
>>>>>>>>> I think the default behavior you explained makes perfect sense &
>>>>>>>>> what one would expect.
>>>>>>>>>
>>>>>>>>> I like the flexibility of INCLUDING and EXCLUDING that you are
>>>>>>>>> considering.
>>>>>>>>>
>>>>>>>>> Would it make sense to use WITH for table options, which would
>>>>>>>>> make it easy (and less confusing IMHO) to override the defaults from 
>>>>>>>>> the
>>>>>>>>> source table, then use INCLUDING/EXCLUDING for all non-table options 
>>>>>>>>> such
>>>>>>>>> as constraints and indices?
>>>>>>>>>
>>>>>>>>> It seems this would be easier to document as well, as it could
>>>>>>>>> just point to the CREATE TABLE doc for the options, rather than 
>>>>>>>>> trying to
>>>>>>>>> explain a bunch of keywords that map to table options.
>>>>>>>>>
>>>>>>>>> -Dave
>>>>>>>>>
>>>>>>>>> David A. Herrington II
>>>>>>>>> President and Chief Engineer
>>>>>>>>> RhinoSource, Inc.
>>>>>>>>>
>>>>>>>>> *Data Lake Architecture, Cloud Computing and Advanced Analytics.*
>>>>>>>>>
>>>>>>>>> www.rhinosource.com
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Oct 16, 2024 at 7:57 PM guo Maxwell <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> To yifan :
>>>>>>>>> At the beginning of the period, I also thought about adding the
>>>>>>>>> keyword ALL, refer to pg
>>>>>>>>> <https://www.postgresql.org/docs/current/sql-createtable.html> ,
>>>>>>>>> but  I give up when writing cep as I find that there may be not so 
>>>>>>>>> many
>>>>>>>>> properties (only three) to copy for C* and
>>>>>>>>> It is possible to decide what is needed and what is not in a very
>>>>>>>>> simple cql, as our ALL is only three properties here. I want to keep 
>>>>>>>>> it as
>>>>>>>>> simple as possible (based on the advice given by Benjamin), So I 
>>>>>>>>> grouped
>>>>>>>>> the properties of the table into one category and expressed it
>>>>>>>>> with OPTION keyword.
>>>>>>>>>
>>>>>>>>> But if we are going to split the first keyword OPTION  to
>>>>>>>>> COMPRESSION 、COMPACTION、COMMENT and so on. I am +1 on adding ALL back 
>>>>>>>>> as
>>>>>>>>> the properties are so many and it is simple to use ALL instead of
>>>>>>>>> list all properties. Besides I may change my keyword WITH to
>>>>>>>>> INCLUDING and adding another keyword EXCLUDING to flexibly copy table
>>>>>>>>> properties through simple sql statements, like using   1 not  2
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>    1.  CREATE TABLE newTb like oldTb INCLUDING ALL EXCLUDING
>>>>>>>>>    INDEXES AND COMMENTS.
>>>>>>>>>    2.  CREATE TABLE newTb like oldTb INCLUDING COMPRESSION
>>>>>>>>>    CONSTRAINTS GENERATED IDENTITY STATISTICS STORAGE
>>>>>>>>>
>>>>>>>>> Conclusion: If there may be more keywords to consider in the
>>>>>>>>> future, such as more than 4 , I am +1 on adding ALL back .
>>>>>>>>>
>>>>>>>>> To Dave :
>>>>>>>>>    Default behavior is only copy column name, data type ,data
>>>>>>>>> mask , you can see more detail from  CEP-43
>>>>>>>>> <https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-43++Apache+Cassandra+CREATE+TABLE++LIKE>
>>>>>>>>> .
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Patrick McFadin <[email protected]> 于2024年10月17日周四 06:43写道：
>>>>>>>>>
>>>>>>>>> +1 That makes much more sense in my experience.
>>>>>>>>>
>>>>>>>>> On Wed, Oct 16, 2024 at 12:12 PM Dave Herrington <
>>>>>>>>> [email protected]> wrote:
>>>>>>>>>
>>>>>>>>> I'm coming at this with both a deep ANSI SQL background as well as
>>>>>>>>> CQL background.
>>>>>>>>>
>>>>>>>>> Defining the default behavior is the starting point.  What gets
>>>>>>>>> copied if we do "CREATE TABLE new_table LIKE original_table;" without 
>>>>>>>>> a
>>>>>>>>> WITH clause?
>>>>>>>>>
>>>>>>>>> Then, you build on that with the specific WITH options.  WITH ALL
>>>>>>>>> catches everything.
>>>>>>>>>
>>>>>>>>> -Dave
>>>>>>>>>
>>>>>>>>> On Wed, Oct 16, 2024 at 11:16 AM Yifan Cai <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> "WITH ALL" seems to be a natural addition to the directives. What
>>>>>>>>> do you think about adding the fifth keyword ALL to retain all fields 
>>>>>>>>> of the
>>>>>>>>> table schema?
>>>>>>>>>
>>>>>>>>> For instance, CREATE TABLE new_table LIKE original_table WITH ALL,
>>>>>>>>> it replicates options, indexes, triggers, constraints and any 
>>>>>>>>> applicable
>>>>>>>>> kinds that are introduced in the future.
>>>>>>>>>
>>>>>>>>> - Yifan
>>>>>>>>>
>>>>>>>>> On Wed, Oct 16, 2024 at 7:46 AM guo Maxwell <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Disscussed with Bernardo on slack，and +1 with his advice on adding
>>>>>>>>> a fourth keyword.
>>>>>>>>>
>>>>>>>>> The keyword would be  CONSTRAINTS ， any more suggestion ？
>>>>>>>>>
>>>>>>>>> guo Maxwell <[email protected]>于2024年10月16日 周三上午9:55写道：
>>>>>>>>>
>>>>>>>>> Hi yifan,
>>>>>>>>> Thanks for bringing this up. The SELECT permission on the original
>>>>>>>>> table is needed. Mysql and PG all have mentioned this, and I also
>>>>>>>>> specifically noticed this in my code.
>>>>>>>>>
>>>>>>>>> I probably missed this in the cep documentation. 😅
>>>>>>>>>
>>>>>>>>> Yifan Cai <[email protected]> 于2024年10月16日周三 07:46写道：
>>>>>>>>>
>>>>>>>>> Thanks for creating the CEP! I think it is missing Bernardo's
>>>>>>>>> comment on "the need for read permissions on the source table".
>>>>>>>>>
>>>>>>>>> CreateTableStatement does not check the permissions outside of the
>>>>>>>>> enclosing keyspace. Having the SELECT permission on the original 
>>>>>>>>> table is a
>>>>>>>>> requirement for CREATE TABLE LIKE.
>>>>>>>>>
>>>>>>>>> - Yifan
>>>>>>>>>
>>>>>>>>> On Sun, Sep 29, 2024 at 11:01 PM guo Maxwell <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Hello, everyone ,
>>>>>>>>> I have finished the doc for CEP-43 for CREATE_TABLE_LIKE
>>>>>>>>> <https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-43++Apache+Cassandra+CREATE+TABLE++LIKE>
>>>>>>>>>  as
>>>>>>>>> said before, looking forward to your suggestions.
>>>>>>>>>
>>>>>>>>> Štefan Miklošovič <[email protected]> 于2024年9月25日周三 03:51写道：
>>>>>>>>>
>>>>>>>>> I am sorry I do not follow what you mean, maybe an example would
>>>>>>>>> help.
>>>>>>>>>
>>>>>>>>> On Tue, Sep 24, 2024 at 6:18 PM guo Maxwell <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> If there are multiple schema information changes in one ddl
>>>>>>>>> statement, will there be schema conflicts in extreme cases?
>>>>>>>>> For example, our statement contains both table creation and index
>>>>>>>>> creation.
>>>>>>>>>
>>>>>>>>> guo Maxwell <[email protected]>于2024年9月24日 周二下午8:12写道：
>>>>>>>>>
>>>>>>>>> +1 on splitting this task  and adding the ability to copy tables
>>>>>>>>> through different keyspaces in the future.
>>>>>>>>>
>>>>>>>>> Štefan Miklošovič <[email protected]> 于2024年9月23日周一 22:05写道：
>>>>>>>>>
>>>>>>>>> If we have this table
>>>>>>>>>
>>>>>>>>> CREATE TABLE ks.tb2 (
>>>>>>>>>     id int PRIMARY KEY,
>>>>>>>>>     name text
>>>>>>>>> );
>>>>>>>>>
>>>>>>>>> I can either specify name of an index on my own like this:
>>>>>>>>>
>>>>>>>>> CREATE INDEX name_index ON ks.tb2 (name) ;
>>>>>>>>>
>>>>>>>>> or I can let Cassandra to figure that name on its own:
>>>>>>>>>
>>>>>>>>> CREATE INDEX ON ks.tb2 (name) ;
>>>>>>>>>
>>>>>>>>> in that case it will name that index "tb2_name_idx".
>>>>>>>>>
>>>>>>>>> Hence, I would expect that when we do
>>>>>>>>>
>>>>>>>>> ALTER TABLE ks.to_copy LIKE ks.tb2 WITH INDICES;
>>>>>>>>>
>>>>>>>>> Then ks.to_copy table will have an index which is called
>>>>>>>>> "to_copy_name_idx" without me doing anything.
>>>>>>>>>
>>>>>>>>> For types, we do not need to do anything when we deal with the
>>>>>>>>> same keyspace. For simplicity, I mentioned that we might deal with 
>>>>>>>>> the same
>>>>>>>>> keyspace scenario only for now and iterate on that in the future.
>>>>>>>>>
>>>>>>>>> On Mon, Sep 23, 2024 at 8:53 AM guo Maxwell <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Hello everyone,
>>>>>>>>>
>>>>>>>>> Cep is being written, and I encountered some problems during the
>>>>>>>>> process. I would like to discuss them with you. If you read the 
>>>>>>>>> description
>>>>>>>>> of this CASSANDRA-7662
>>>>>>>>> <https://issues.apache.org/jira/browse/CASSANDRA-7662>, we will
>>>>>>>>> find that initially the original creator of this jira did not intend 
>>>>>>>>> to
>>>>>>>>> implement structural copying of indexes, views, and triggers  only the
>>>>>>>>> column and its data type.
>>>>>>>>>
>>>>>>>>> However, after investigating some db related syntax and function
>>>>>>>>> implementation, I found that it may be necessary for us to provide 
>>>>>>>>> some
>>>>>>>>> rich syntax to support the replication of indexes, views, etc.
>>>>>>>>>
>>>>>>>>> In order to support selective copy of the basic structure of the
>>>>>>>>> table (columns and types), table options, table-related indexes, 
>>>>>>>>> views,
>>>>>>>>> triggers, etc. We need some new syntax, it seems that the syntax of 
>>>>>>>>> pg is
>>>>>>>>> relatively comprehensive, it use the keyword INCLUDING/EXCLUDING to
>>>>>>>>> flexibly control the removal and retention of indexes, table 
>>>>>>>>> information,
>>>>>>>>> etc. see pg create table like
>>>>>>>>> <https://www.postgresql.org/docs/8.1/sql-createtable.html> , the
>>>>>>>>> new created index name is different from the original table's index 
>>>>>>>>> name ,
>>>>>>>>> seenewly copied index names are different from original
>>>>>>>>> <https://github.com/postgres/postgres/blob/master/doc/src/sgml/ref/create_table.sgml#L749>
>>>>>>>>> , the name is based on some rule.
>>>>>>>>> Mysql is relatively simple and copies columns and indexes by
>>>>>>>>> default. see mysql create table like
>>>>>>>>> <https://dev.mysql.com/doc/refman/8.4/en/create-table-like.html>
>>>>>>>>> and the newly created index name is the same with the original table's
>>>>>>>>> index name.
>>>>>>>>>
>>>>>>>>> So for Casandra, I hope it can also support the information copy
>>>>>>>>> of index and even view/trigger. And I also hope to be able to flexibly
>>>>>>>>> decide which information is copied like pg.
>>>>>>>>>
>>>>>>>>> Besides, I think the copy can happen between different keyspaces.
>>>>>>>>> And UDT needs to be taken into account.
>>>>>>>>>
>>>>>>>>> But as we know the index/view/trigger name are all under keyspace
>>>>>>>>> level, so it seems that the newly created index name (or view name/ 
>>>>>>>>> trigger
>>>>>>>>> name) must be different from the original tables' ,otherwise  names 
>>>>>>>>> would
>>>>>>>>> clash .
>>>>>>>>>
>>>>>>>>> So regarding the above problem, one idea I have is that for newly
>>>>>>>>> created types, indexes and views under different keyspaces and the 
>>>>>>>>> same
>>>>>>>>> keyspace, we first generate random names for them, and then we can 
>>>>>>>>> add the
>>>>>>>>> ability of modifying the names(for types/indexes/views/triggers) so 
>>>>>>>>> that
>>>>>>>>> users can manually change the names.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> guo Maxwell <[email protected]> 于2024年9月20日周五 08:06写道：
>>>>>>>>>
>>>>>>>>> No，I think still need some discuss on grammar detail after I
>>>>>>>>> finish the first version
>>>>>>>>>
>>>>>>>>> Patrick McFadin <[email protected]>于2024年9月20日 周五上午2:24写道：
>>>>>>>>>
>>>>>>>>> Is this CEP ready for a VOTE thread?
>>>>>>>>>
>>>>>>>>> On Sat, Aug 24, 2024 at 8:56 PM guo Maxwell <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Thank you for your replies, I will prepare a CEP later.
>>>>>>>>>
>>>>>>>>> Patrick McFadin <[email protected]> 于2024年8月20日周二 02:11写道：
>>>>>>>>>
>>>>>>>>> +1 This is a CEP
>>>>>>>>>
>>>>>>>>> On Mon, Aug 19, 2024 at 10:50 AM Jon Haddad <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Given the fairly large surface area for this, i think it should be
>>>>>>>>> a CEP.
>>>>>>>>>
>>>>>>>>> —
>>>>>>>>> Jon Haddad
>>>>>>>>> Rustyrazorblade Consulting
>>>>>>>>> rustyrazorblade.com
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Aug 19, 2024 at 10:44 AM Bernardo Botella <
>>>>>>>>> [email protected]> wrote:
>>>>>>>>>
>>>>>>>>> Definitely a nice addition to CQL.
>>>>>>>>>
>>>>>>>>> Looking for inspiration at how Postgres and Mysql do that may also
>>>>>>>>> help with the final design (I like the WITH proposed by Stefan, but I 
>>>>>>>>> would
>>>>>>>>> definitely take a look at the INCLUDING keyword proposed by Postgres).
>>>>>>>>> https://www.postgresql.org/docs/current/sql-createtable.html
>>>>>>>>> https://dev.mysql.com/doc/refman/8.4/en/create-table-like.html
>>>>>>>>>
>>>>>>>>> On top of that, and as part of the interesting questions, I would
>>>>>>>>> like to add the permissions to the mix. Both the question about 
>>>>>>>>> copying
>>>>>>>>> them over (with a WITH keyword probably), and the need for read 
>>>>>>>>> permissions
>>>>>>>>> on the source table as well.
>>>>>>>>>
>>>>>>>>> Bernardo
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Aug 19, 2024, at 10:01 AM, Štefan Miklošovič <
>>>>>>>>> [email protected]> wrote:
>>>>>>>>>
>>>>>>>>> BTW this would be cool to do as well:
>>>>>>>>>
>>>>>>>>> ALTER TABLE ks.to_copy LIKE ks.tb WITH INDICES;
>>>>>>>>>
>>>>>>>>> This would mean that if we create a copy of a table, later we can
>>>>>>>>> decide that we need indices too, so we might "enrich" that table with
>>>>>>>>> indices from the old one without necessarily explicitly re-creating 
>>>>>>>>> them on
>>>>>>>>> that new table.
>>>>>>>>>
>>>>>>>>> On Mon, Aug 19, 2024 at 6:55 PM Štefan Miklošovič <
>>>>>>>>> [email protected]> wrote:
>>>>>>>>>
>>>>>>>>> I think this is an interesting idea worth exploring. I definitely
>>>>>>>>> agree with Benjamin who raised important questions which needs to be
>>>>>>>>> answered first. Also, what about triggers?
>>>>>>>>>
>>>>>>>>> It might be rather "easy" to come up with something simple but it
>>>>>>>>> should be a comprehensive solution with predictable behavior we all 
>>>>>>>>> agree
>>>>>>>>> on.
>>>>>>>>>
>>>>>>>>> If a keyspace of a new table does not exist we would need to
>>>>>>>>> create that one too before. For the simplicity, I would just make it 
>>>>>>>>> a must
>>>>>>>>> to create it on same keyspace. We might iterate on that in the future.
>>>>>>>>>
>>>>>>>>> UDTs are created per keyspace so there is nothing to re-create. We
>>>>>>>>> just need to reference it from a new table, right?
>>>>>>>>>
>>>>>>>>> Indexes and MVs are interesting but in theory they might be
>>>>>>>>> re-created too.
>>>>>>>>>
>>>>>>>>> Would it be appropriate to use something like this?
>>>>>>>>>
>>>>>>>>> CREATE TABLE ks.tb_copy LIKE ks.tb WITH INDEXES AND VIEWS AND
>>>>>>>>> TRIGGERS ....
>>>>>>>>>
>>>>>>>>> Without "WITH" it would just copy a table with nothing else.
>>>>>>>>>
>>>>>>>>> On Mon, Aug 19, 2024 at 6:10 PM guo Maxwell <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Hello， everyone：
>>>>>>>>> As  Jira CASSANDRA-7662
>>>>>>>>> <https://issues.apache.org/jira/browse/CASSANDRA-7662> has
>>>>>>>>> described , we would like to introduce a new grammer " CREATE TABLE 
>>>>>>>>> LIKE "
>>>>>>>>> ,which  simplifies creating new tables duplicating the existing ones .
>>>>>>>>> The format may be like : CREATE TABLE <new_table> LIKE <old_table>
>>>>>>>>> Before I implement this function, do you have any suggestions on
>>>>>>>>> this?
>>>>>>>>>
>>>>>>>>> Looking forward to your reply！
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> -Dave
>>>>>>>>>
>>>>>>>>> David A. Herrington II
>>>>>>>>> President and Chief Engineer
>>>>>>>>> RhinoSource, Inc.
>>>>>>>>>
>>>>>>>>> *Data Lake Architecture, Cloud Computing and Advanced Analytics.*
>>>>>>>>>
>>>>>>>>> www.rhinosource.com
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> -Dave
>>>>>>>>>
>>>>>>>>> David A. Herrington II
>>>>>>>>> President and Chief Engineer
>>>>>>>>> RhinoSource, Inc.
>>>>>>>>>
>>>>>>>>> *Data Lake Architecture, Cloud Computing and Advanced Analytics.*
>>>>>>>>>
>>>>>>>>> www.rhinosource.com
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> -Dave
>>>>>>>>>
>>>>>>>>> David A. Herrington II
>>>>>>>>> President and Chief Engineer
>>>>>>>>> RhinoSource, Inc.
>>>>>>>>>
>>>>>>>>> *Data Lake Architecture, Cloud Computing and Advanced Analytics.*
>>>>>>>>>
>>>>>>>>> www.rhinosource.com
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> -Dave
>>>>>>>>>
>>>>>>>>> David A. Herrington II
>>>>>>>>> President and Chief Engineer
>>>>>>>>> RhinoSource, Inc.
>>>>>>>>>
>>>>>>>>> *Data Lake Architecture, Cloud Computing and Advanced Analytics.*
>>>>>>>>>
>>>>>>>>> www.rhinosource.com
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>

Re: [DISCUSS] Introduce CREATE TABLE LIKE grammer

Reply via email to