My inclination is not to support this until we support arbitrary numbers of IF statements. It’s one too many arbitrary restrictions and it potentially gets confusing.
But I don’t feel super strongly about it. > On 21 Sep 2022, at 20:56, Patrick McFadin <pmcfa...@gmail.com> wrote: > > > I'm also working on different use cases and syntax for Accord :D > > I'm +1 on this change and leaving the door open for maybe a few more as we > test this out. It needs to be functionally useful for developers in v1, and I > think it's worth the changes to get it right. > > One other thing Caleb and I have been discussing is how, when running a > transaction, the statement returns with no message. In CQLSH you have no idea > if anything happened unless you select from the tables and look for changes. > Even something like LWT adds with "applied=true|false" > > Patrick > >> On Wed, Sep 21, 2022 at 12:42 PM David Capwell <dcapw...@apple.com> wrote: >> Caleb is making great progress on this, and I have been working on CQL fuzz >> testing the new grammar to make sure we flesh out cases quickly; one thing >> we hit was about mixing conditional and non-conditional updates; will use a >> example to better show >> >> BEGIN TRANSACTION >> LET a = (SELECT * FROM ….); >> IF a IS NOT NULL THEN >> UPDATE …; >> END IF >> INSERT INTO ... >> COMMIT TRANSACTION >> >> In this case we have 1 UPDATE tied to the IF condition, and one INSERT that >> isn’t… for v1 do we need/want to support this, or is it best for v1 to be >> simple and have all updates tied to conditional when present? >> >>> On Aug 22, 2022, at 9:19 AM, Avi Kivity via dev <dev@cassandra.apache.org> >>> wrote: >>> >>> I wasn't referring to specific syntax but to the concept. If a SQL dialect >>> (or better, the standard) has a way to select data into a variable, let's >>> adopt it. >>> >>> If such syntax doesn't exist, LET (a, b, c) = (SELECT x, y, z FROM tab) is >>> my preference. >>> >>> On 8/22/22 19:13, Patrick McFadin wrote: >>>> The replies got trashed pretty badly in the responses. >>>> When you say: "Agree it's better to reuse existing syntax than invent new >>>> syntax." >>>> >>>> Which syntax are you referring to? >>>> >>>> Patrick >>>> >>>> >>>> On Mon, Aug 22, 2022 at 1:36 AM Avi Kivity via dev >>>> <dev@cassandra.apache.org> wrote: >>>>> Agree it's better to reuse existing syntax than invent new syntax. >>>>> >>>>> On 8/21/22 16:52, Konstantin Osipov wrote: >>>>> > * Avi Kivity via dev <dev@cassandra.apache.org> [22/08/14 15:59]: >>>>> > >>>>> > MySQL supports SELECT <expr_list> INTO <var_list> FROM ... WHERE >>>>> > ... >>>>> > >>>>> > PostgreSQL supports pretty much the same syntax. >>>>> > >>>>> > Maybe instead of LET use the ANSI/MySQL/PostgreSQL DECLARE var TYPE and >>>>> > MySQL/PostgreSQL SELECT ... INTO? >>>>> > >>>>> >> On 14/08/2022 01.29, Benedict Elliott Smith wrote: >>>>> >>> >>>>> >>> I’ll do my best to express with my thinking, as well as how I would >>>>> >>> explain the feature to a user. >>>>> >>> >>>>> >>> My mental model for LET statements is that they are simply SELECT >>>>> >>> statements where the columns that are selected become variables >>>>> >>> accessible anywhere in the scope of the transaction. That is to say, >>>>> >>> you >>>>> >>> should be able to run something like s/LET/SELECT and >>>>> >>> s/([^=]+)=([^,]+)(,|$)/\2 AS \1\3/g on the columns of a LET statement >>>>> >>> and produce a valid SELECT statement, and vice versa. Both should >>>>> >>> perform identically. >>>>> >>> >>>>> >>> e.g. >>>>> >>> SELECT pk AS key, v AS value FROM table >>>>> >>> >>>>> >>> => >>>>> >>> LET key = pk, value = v FROM table >>>>> >> >>>>> >> "=" is a CQL/SQL operator. Cassandra doesn't support it yet, but SQL >>>>> >> supports selecting comparisons: >>>>> >> >>>>> >> >>>>> >> $ psql >>>>> >> psql (14.3) >>>>> >> Type "help" for help. >>>>> >> >>>>> >> avi=# SELECT 1 = 2, 3 = 3, NULL = NULL; >>>>> >> ?column? | ?column? | ?column? >>>>> >> ----------+----------+---------- >>>>> >> f | t | >>>>> >> (1 row) >>>>> >> >>>>> >> >>>>> >> Using "=" as a syntactic element in LET would make SELECT and LET >>>>> >> incompatible once comparisons become valid selectors. Unless they >>>>> >> become >>>>> >> mandatory (and then you'd write "LET q = a = b" if you wanted to >>>>> >> select a >>>>> >> comparison). >>>>> >> >>>>> >> >>>>> >> I personally prefer the nested query syntax: >>>>> >> >>>>> >> >>>>> >> LET (a, b, c) = (SELECT foo, bar, x+y FROM ...); >>>>> >> >>>>> >> >>>>> >> So there aren't two similar-but-not-quite-the-same syntaxes. SELECT is >>>>> >> immediately recognizable by everyone as a query, LET is not. >>>>> >> >>>>> >> >>>>> >>> Identical form, identical behaviour. Every statement should be >>>>> >>> directly >>>>> >>> translatable with some simple text manipulation. >>>>> >>> >>>>> >>> We can then make this more powerful for users by simply expanding >>>>> >>> SELECT >>>>> >>> statements, e.g. by permitting them to declare constants and tuples in >>>>> >>> the column results. In this scheme LET x = * is simply syntactic sugar >>>>> >>> for LET x = (pk, ck, field1, …) This scheme then supports options 2, 4 >>>>> >>> and 5 all at once, consistently alongside each other. >>>>> >>> >>>>> >>> Option 6 is in fact very similar, but is strictly less flexible for >>>>> >>> the >>>>> >>> user as they have no way to declare multiple scalar variables without >>>>> >>> scoping them inside a tuple. >>>>> >>> >>>>> >>> e.g. >>>>> >>> LET key = pk, value = v FROM table >>>>> >>> IF key > 1 AND value > 1 THEN... >>>>> >>> >>>>> >>> => >>>>> >>> LET row = SELECT pk AS key, v AS value FROM table >>>>> >>> IF row.key > 1 AND row.value > 1 THEN… >>>>> >>> >>>>> >>> However, both are expressible in the existing proposal, as if you >>>>> >>> prefer >>>>> >>> this naming scheme you can simply write >>>>> >>> >>>>> >>> LET row = (pk AS key, v AS value) FROM table >>>>> >>> IF row.key > 1 AND row.value > 1 THEN… >>>>> >>> >>>>> >>> With respect to auto converting single column results to a scalar, we >>>>> >>> do >>>>> >>> need a way for the user to say they care whether the row was null or >>>>> >>> the >>>>> >>> column. I think an implicit conversion here could be surprising. >>>>> >>> However >>>>> >>> we could implement tuple expressions anyway and let the user >>>>> >>> explicitly >>>>> >>> declare v as a tuple as Caleb has suggested for the existing proposal >>>>> >>> as >>>>> >>> well. >>>>> >>> >>>>> >>> Assigning constants or other values not selected from a table would >>>>> >>> also >>>>> >>> be a little clunky: >>>>> >>> >>>>> >>> LET v1 = someFunc(), v2 = someOtherFunc(?) >>>>> >>> IF v1 > 1 AND v2 > 1 THEN… >>>>> >>> >>>>> >>> => >>>>> >>> LET row = SELECT someFunc() AS v1, someOtherFunc(?) AS v2 >>>>> >>> IF row.v1 > 1 AND row.v2 > 1 THEN... >>>>> >>> >>>>> >>> That said, the proposals are /close/ to identical, it is just slightly >>>>> >>> more verbose and slightly less flexible. >>>>> >>> >>>>> >>> Which one would be most intuitive to users is hard to predict. It >>>>> >>> might >>>>> >>> be that Option 6 would be slightly easier, but I’m unsure if there >>>>> >>> would >>>>> >>> be a huge difference. >>>>> >>> >>>>> >>> >>>>> >>>> On 13 Aug 2022, at 16:59, Patrick McFadin <pmcfa...@gmail.com> wrote: >>>>> >>>> >>>>> >>>> I'm really happy to see CEP-15 getting closer to a final >>>>> >>>> implementation. I'm going to walk through my reasoning for your >>>>> >>>> proposals wrt trying to explain this to somebody new. >>>>> >>>> >>>>> >>>> Looking at all the options, the first thing that comes up for me is >>>>> >>>> the Cassandra project's complicated relationship with NULL. We have >>>>> >>>> prior art with EXISTS/NOT EXISTS when creating new tables. IS >>>>> >>>> NULL/IS NOT NULL is used in materialized views similarly to >>>>> >>>> proposals 2,4 and 5. >>>>> >>>> >>>>> >>>> CREATE MATERIALIZED VIEW [ IF NOT EXISTS ] [keyspace_name.]view_name >>>>> >>>> AS SELECT [ (column_list) ] >>>>> >>>> FROM [keyspace_name.]table_name >>>>> >>>> [ WHERE column_name IS NOT NULL >>>>> >>>> [ AND column_name IS NOT NULL ... ] ] >>>>> >>>> [ AND relation [ AND ... ] ] >>>>> >>>> PRIMARY KEY ( column_list ) >>>>> >>>> [ WITH [ table_properties ] >>>>> >>>> [ [ AND ] CLUSTERING ORDER BY (cluster_column_name order_option) >>>>> >>>> ] ] ; >>>>> >>>> >>>>> >>>> Based on that, I believe 1 and 3 would just confuse users, so -1 on >>>>> >>>> those. >>>>> >>>> >>>>> >>>> Trying to explain the difference between row and column operations >>>>> >>>> with LET, I can't see the difference between a row and column in #2. >>>>> >>>> >>>>> >>>> #4 introduces a boolean instead of column names and just adds more >>>>> >>>> syntax. >>>>> >>>> >>>>> >>>> #5 is verbose and, in my opinion, easier to reason when writing a >>>>> >>>> query. Thinking top down, I need to know if these exact rows and/or >>>>> >>>> column values exist before changing them, so I'll define them first. >>>>> >>>> Then I'll iterate over the state I created in my actual changes so I >>>>> >>>> know I'm changing precisely what I want. >>>>> >>>> >>>>> >>>> #5 could use a bit more to be clearer to somebody who doesn't write >>>>> >>>> CQL queries daily and wouldn't require memorizing subtle >>>>> >>>> differences. It should be similar to all the other syntax, so >>>>> >>>> learning a little about CQL will let you move into more without >>>>> >>>> completely re-learning the new syntax. >>>>> >>>> >>>>> >>>> So I propose #6) >>>>> >>>> BEGIN TRANSACTION >>>>> >>>> LET row1 = SELECT * FROM ks.tbl WHERE k=0 AND c=0; <-- * selects all >>>>> >>>> columns >>>>> >>>> LET row2 = SELECT v FROM ks.tbl WHERE k=1 AND c=0; >>>>> >>>> SELECT row1, row2 >>>>> >>>> IF row1 IS NULL AND row2.v = 3 THEN >>>>> >>>> INSERT INTO ks.tbl (k, c, v) VALUES (0, 0, 1); >>>>> >>>> END IF >>>>> >>>> COMMIT TRANSACTION >>>>> >>>> >>>>> >>>> I added the SELECT in the LET just so it's straightforward, you are >>>>> >>>> reading, and it's just like doing a regular select, but you are >>>>> >>>> assigning it to a variable. >>>>> >>>> >>>>> >>>> I removed the confusing 'row1.v'and replaced it with 'row1'I can't >>>>> >>>> see why you would need the '.v'vs having the complete variable I >>>>> >>>> created in the statement above. >>>>> >>>> >>>>> >>>> EOL >>>>> >>>> >>>>> >>>> Patrick >>>>> >>>> >>>>> >>>> On Thu, Aug 11, 2022 at 1:37 PM Caleb Rackliffe >>>>> >>>> <calebrackli...@gmail.com> wrote: >>>>> >>>> >>>>> >>>> ...and one more option... >>>>> >>>> >>>>> >>>> 5.) Introduce tuple assignments, removing all ambiguity around >>>>> >>>> row vs. column operations. >>>>> >>>> >>>>> >>>> BEGIN TRANSACTION >>>>> >>>> LET row1 = * FROM ks.tbl WHERE k=0 AND c=0; <-- * selects all >>>>> >>>> columns >>>>> >>>> LET row2 = (v) FROM ks.tbl WHERE k=1 AND c=0; >>>>> >>>> SELECT row1.v, row2.v >>>>> >>>> IF row1 IS NULL AND row2.v = 3 THEN >>>>> >>>> INSERT INTO ks.tbl (k, c, v) VALUES (0, 0, 1); >>>>> >>>> END IF >>>>> >>>> COMMIT TRANSACTION >>>>> >>>> >>>>> >>>> >>>>> >>>> >>>>> >>>> On Thu, Aug 11, 2022 at 12:55 PM Caleb Rackliffe >>>>> >>>> <calebrackli...@gmail.com> wrote: >>>>> >>>> >>>>> >>>> via Benedict, here is a 4th option: >>>>> >>>> >>>>> >>>> 4.) Similar to #2, but don't rely on the key element being >>>>> >>>> NULL. >>>>> >>>> >>>>> >>>> If the read returns no result, x effectively becomes NULL. >>>>> >>>> Otherwise, it remains true/NOT NULL. >>>>> >>>> >>>>> >>>> BEGIN TRANSACTION >>>>> >>>> LET x = true FROM ks.tbl WHERE k=0 AND c=0; >>>>> >>>> LET row2_v = v FROM ks.tbl WHERE k=1 AND c=0; >>>>> >>>> SELECT x, row2_v >>>>> >>>> IF x IS NULL AND row2_v = 3 THEN >>>>> >>>> INSERT INTO ks.tbl (k, c, v) VALUES (0, 0, 1); >>>>> >>>> END IF >>>>> >>>> COMMIT TRANSACTION >>>>> >>>> >>>>> >>>> On Thu, Aug 11, 2022 at 12:12 PM Caleb Rackliffe >>>>> >>>> <calebrackli...@gmail.com> wrote: >>>>> >>>> >>>>> >>>> Hello again everyone! >>>>> >>>> >>>>> >>>> I've been working on a prototype >>>>> >>>> <https://issues.apache.org/jira/browse/CASSANDRA-17719> >>>>> >>>> in >>>>> >>>> CASSANDRA-17719 for a grammar that roughly corresponds >>>>> >>>> to >>>>> >>>> what we've agreed on in this thread. One thing that >>>>> >>>> isn't >>>>> >>>> immediately obvious to me is how the LET syntax handles >>>>> >>>> cases where we want to check for the plain existence of >>>>> >>>> a >>>>> >>>> row in IF. For example, in this hybrid of the originally >>>>> >>>> proposed syntax and something more like what we've >>>>> >>>> agreed >>>>> >>>> on (and the RETURNING just to distinguish between that >>>>> >>>> and SELECT), this could be pretty straightforward: >>>>> >>>> >>>>> >>>> BEGIN TRANSACTION >>>>> >>>> SELECT v FROM ks.tbl WHERE k=0 AND c=0 AS row1; >>>>> >>>> SELECT v FROM ks.tbl WHERE k=1 AND c=0 AS row2; >>>>> >>>> RETURNING row1.v, row2.v >>>>> >>>> IF row1 NOT EXISTS AND row2.v = 3 THEN >>>>> >>>> INSERT INTO ks.tbl (k, c, v) VALUES (0, 0, 1); >>>>> >>>> END IF >>>>> >>>> COMMIT TRANSACTION >>>>> >>>> >>>>> >>>> The NOT EXISTS operator has row1 to work with. One the >>>>> >>>> other hand, w/ the LET syntax and no naming of reads, >>>>> >>>> it's not clear what the best solution would be. Here are >>>>> >>>> a few possibilities: >>>>> >>>> >>>>> >>>> 1.) Provide a few built-in functions that operate on a >>>>> >>>> whole result row. If we assume a SQL style IS NULL and >>>>> >>>> IS >>>>> >>>> NOT NULL (see my last post here) for operations on >>>>> >>>> particular columns, this probably eliminates the need >>>>> >>>> for >>>>> >>>> EXISTS/NOT EXISTS as well. >>>>> >>>> >>>>> >>>> BEGIN TRANSACTION >>>>> >>>> LET row1_missing = notExists() FROM ks.tbl WHERE k=0 >>>>> >>>> AND c=0; >>>>> >>>> LET row2_v = v FROM ks.tbl WHERE k=1 AND c=0; >>>>> >>>> SELECT row1_missing, row2_v >>>>> >>>> IF row1_missing AND row2_v = 3 THEN >>>>> >>>> INSERT INTO ks.tbl (k, c, v) VALUES (0, 0, 1); >>>>> >>>> END IF >>>>> >>>> COMMIT TRANSACTION >>>>> >>>> >>>>> >>>> 2.) Assign and check the first primary key element to >>>>> >>>> determine whether the row exists. >>>>> >>>> >>>>> >>>> BEGIN TRANSACTION >>>>> >>>> LET row1_k = k FROM ks.tbl WHERE k=0 AND c=0; >>>>> >>>> LET row2_v = v FROM ks.tbl WHERE k=1 AND c=0; >>>>> >>>> SELECT row1_k, row2_v >>>>> >>>> IF row1_k IS NULL AND row2_v = 3 THEN >>>>> >>>> INSERT INTO ks.tbl (k, c, v) VALUES (0, 0, 1); >>>>> >>>> END IF >>>>> >>>> COMMIT TRANSACTION >>>>> >>>> >>>>> >>>> 3.) Reconsider the LET concept toward something that >>>>> >>>> allows us to explicitly name our reads again. >>>>> >>>> >>>>> >>>> BEGIN TRANSACTION >>>>> >>>> WITH (SELECT v FROM ks.tbl WHERE k=0 AND c=0) AS row1; >>>>> >>>> WITH (SELECT v FROM ks.tbl WHERE k=1 AND c=0) AS row2; >>>>> >>>> SELECT row1.v, row2.v >>>>> >>>> IF row1 NOT EXISTS AND row2.v = 3 THEN >>>>> >>>> INSERT INTO ks.tbl (k, c, v) VALUES (0, 0, 1); >>>>> >>>> END IF >>>>> >>>> COMMIT TRANSACTION >>>>> >>>> >>>>> >>>> I don't have a strong affinity for any of these, >>>>> >>>> although >>>>> >>>> #1 seems the most awkward. >>>>> >>>> >>>>> >>>> Does anyone have any other alternatives? Preference for >>>>> >>>> one of the above options? >>>>> >>>> >>>>> >>>> Thanks! >>>>> >>>> >>>>> >>>> On Fri, Jul 22, 2022 at 11:21 AM Caleb Rackliffe >>>>> >>>> <calebrackli...@gmail.com> wrote: >>>>> >>>> >>>>> >>>> Avi brought up an interesting point around NULLness >>>>> >>>> checking inCASSANDRA-17762 >>>>> >>>> >>>>> >>>> <https://issues.apache.org/jira/browse/CASSANDRA-17762>... >>>>> >>>> >>>>> >>>> In SQL, any comparison with NULL is NULL, which >>>>> >>>> is interpreted as FALSE in a condition. To test >>>>> >>>> for NULLness, you use IS NULL or IS NOT NULL. >>>>> >>>> But >>>>> >>>> LWT uses IF col = NULL as a NULLness test. This >>>>> >>>> is likely to confuse people coming from SQL and >>>>> >>>> hamper attempts to extend the dialect. >>>>> >>>> >>>>> >>>> >>>>> >>>> We can leave that Jira open to address what to do in >>>>> >>>> the legacy LWT case, but I'd support a SQL-congruent >>>>> >>>> syntax here (IS NULL or IS NOT NULL), where we have >>>>> >>>> something closer to a blank slate. >>>>> >>>> >>>>> >>>> Thoughts? >>>>> >>>> >>>>> >>>> On Thu, Jun 30, 2022 at 6:25 PM Abe Ratnofsky >>>>> >>>> <a...@aber.io> wrote: >>>>> >>>> >>>>> >>>> The new syntax looks great, and I’m really >>>>> >>>> excited to see this coming together. >>>>> >>>> >>>>> >>>> One piece of feedback on the proposed syntax is >>>>> >>>> around the use of “=“ as a declaration in >>>>> >>>> addition to its current use as an equality >>>>> >>>> operator in a WHERE clause and an assignment >>>>> >>>> operator in an UPDATE: >>>>> >>>> >>>>> >>>> BEGIN TRANSACTION >>>>> >>>> LET car_miles = miles_driven, >>>>> >>>> car_is_running = is_running FROM cars WHERE >>>>> >>>> model=’pinto’ >>>>> >>>> LET user_miles = miles_driven FROM users >>>>> >>>> WHERE name=’blake’ >>>>> >>>> SELECT something else from some other >>>>> >>>> table >>>>> >>>> IF NOT car_is_running THEN ABORT >>>>> >>>> UPDATE users SET miles_driven = user_miles >>>>> >>>> + 30 WHERE name='blake'; >>>>> >>>> UPDATE cars SET miles_driven = car_miles + >>>>> >>>> 30 WHERE model='pinto'; >>>>> >>>> COMMIT TRANSACTION >>>>> >>>> >>>>> >>>> This is supported in languages like PL/pgSQL, >>>>> >>>> but >>>>> >>>> in a normal SQL query kind of local declaration >>>>> >>>> is often expressed as an alias (SELECT col AS >>>>> >>>> new_col), subquery alias (SELECT col) t, or >>>>> >>>> common table expression (WITH t AS (SELECT >>>>> >>>> col)). >>>>> >>>> >>>>> >>>> Here’s an example of an alternative to the >>>>> >>>> proposed syntax that I’d find more readable: >>>>> >>>> >>>>> >>>> BEGIN TRANSACTION >>>>> >>>> WITH car_miles, car_is_running AS (SELECT >>>>> >>>> miles_driven, is_running FROM cars WHERE >>>>> >>>> model=’pinto’), >>>>> >>>> user_miles AS (SELECT miles_driven FROM >>>>> >>>> users >>>>> >>>> WHERE name=’blake’) >>>>> >>>> IF NOT car_is_running THEN ABORT >>>>> >>>> UPDATE users SET miles_driven = user_miles >>>>> >>>> + 30 WHERE name='blake'; >>>>> >>>> UPDATE cars SET miles_driven = car_miles + >>>>> >>>> 30 >>>>> >>>> WHERE model='pinto'; >>>>> >>>> COMMIT TRANSACTION >>>>> >>>> >>>>> >>>> There’s also the option of naming the >>>>> >>>> transaction >>>>> >>>> like a subquery, and supporting LET via AS (this >>>>> >>>> one I’m less sure about but wanted to propose >>>>> >>>> anyway): >>>>> >>>> >>>>> >>>> BEGIN TRANSACTION t1 >>>>> >>>> SELECT miles_driven AS t1.car_miles, >>>>> >>>> is_running AS t1.car_is_running FROM cars >>>>> >>>> WHERE model=’pinto’; >>>>> >>>> SELECT miles_driven AS t1.user_miles FROM >>>>> >>>> users WHERE name=’blake’; >>>>> >>>> IF NOT car_is_running THEN ABORT >>>>> >>>> UPDATE users SET miles_driven = user_miles >>>>> >>>> + 30 WHERE name='blake'; >>>>> >>>> UPDATE cars SET miles_driven = car_miles + >>>>> >>>> 30 WHERE model='pinto'; >>>>> >>>> COMMIT TRANSACTION >>>>> >>>> >>>>> >>>> This also has the benefit of resolving ambiguity >>>>> >>>> in case of naming conflicts with existing (or >>>>> >>>> future) column names. >>>>> >>>> >>>>> >>>> -- >>>>> >>>> Abe >>>>> >>>> >>>>> >>> >>