> One way to make it obvious is to require the user to explicitly type the > SELECTs and then to require that all SELECTs appear before > UPDATE/INSERT/DELETE.
Yes, I agree that SELECT statements should be required to go first. However, I think this is sufficient and we can retain the shorter format for RETURNING. There only remains the issue of conditions imposed upon UPDATE/INSERT/DELETE statements when there are multiple statements that affect the same primary key. I think we can (and should) simply reject such queries for now, as it doesn’t make much sense to have multiple statements for the same primary key in the same transaction. > Returning the "result" from an UPDATE presents the question should it be the > data at the start of the transaction or end state? I am inclined to only return the new values (as proposed by Alex) for the purpose of returning new auto-increment values etc. If you require the prior value, SELECT is available to express this. > I was thinking the following coordinator-side implementation would allow to > use also old drivers I am inclined to return just the first result set to old clients. I think it’s fine to require a client upgrade to get multiple result sets. From: Henrik Ingo <henrik.i...@datastax.com> Date: Monday, 6 June 2022 at 15:18 To: dev@cassandra.apache.org <dev@cassandra.apache.org> Subject: Re: CEP-15 multi key transaction syntax Thank you Blake and team! Just some personal reactions and thoughts... First instinct is to support the shorter format where UPDATE ... AS car is also its own implicit select. However, a subtle thing to note is that a reasonable user might expect that in a sequence of multiple UPDATEs, each of them is also read at the position where the UPDATE is in the list of statements. The fact that Accord executes all reads first is not at all obvious from the syntax. One way to make it obvious is to require the user to explicitly type the SELECTs and then to require that all SELECTs appear before UPDATE/INSERT/DELETE. I like the idea of a RETURN or RETURNING keyword to specify what exactly you want to return. This would allow to also return results from UPDATE/INSERT since the user explicitly told us to do so. Returning the "result" from an UPDATE presents the question should it be the data at the start of the transaction or end state? Interestingly the MongoDB $findAndModify operation allows you to choose between both options. There seems to be a valid use case for both. The obvious examples are: UPDATE t SET c=100 WHERE id=1 AS t RETURNING BEFORE c; COMMIT TRANSACTION IF t.c <= 100; I want to know the value of what c was before I replaced with a new value. INSERT INTO t (c) VALUES (100) AS t RETURNING AFTER d; COMMIT TRANSACTION IF t.c <= 100; I want to know the defaulted value of d. (...as was already pointed out in another email.) UPDATE t SET c+=1 WHERE id=1 AS t RETURNING AFTER c; COMMIT TRANSACTION IF t.c <= 100; I want to know the result of c after the transaction. (Which I know will be at most 100, but I want to know exactly.) I kind of sympathize with the intuitive opinion that we should return the values from the start of the transaction, since that's how Accord works: reads first, updates second. Finally, I wanted to share a thought on how to implement the returning of multiple result sets. While you don't address it, I'm assuming the driver api will get new functionality where you can get a specific result set out of many. I was thinking the following coordinator-side implementation would allow to use also old drivers: BEGIN TRANSACTION; SELECT * FROM table1 WHERE .... AS t1; SELECT * FROM table2 WHERE .... AS t2; UPDATE something... COMMIT TRANSACTION; SELECT * FROM t1; SELECT * FROM t2; The coordinator-level implementation here would be to store the results of the SELECTs inside a transaction into temporary tables that the client can the read from after the transaction. Even if those later selects are outside the transaction, their contents would be a constant snapshot representing the state of those rows at the time of the transaction. The tables should be visible only to the same client session and until the start of the next transaction or a timeout, whichever comes first. henrik On Fri, Jun 3, 2022 at 6:39 PM Blake Eggleston <beggles...@apple.com<mailto:beggles...@apple.com>> wrote: Hi dev@, I’ve been working on a draft syntax for Accord transactions and wanted to bring what I have to the dev list to solicit feedback and build consensus before moving forward with it. The proposed transaction syntax is intended to be an extended batch syntax. Basically batches with selects, and an optional condition at the end. To facilitate conditions against an arbitrary number of select statements, you can also name the statements, and reference columns in the results. To cut down on the number of operations needed, select values can also be used in updates, including some math operations. Parameterization of literals is supported the same as other statements. Here's an example selecting a row from 2 tables, and issuing updates for each row if a condition is met: BEGIN TRANSACTION; SELECT * FROM users WHERE name='blake' AS user; SELECT * from cars WHERE model='pinto' AS car; UPDATE users SET miles_driven = user.miles_driven + 30 WHERE name='blake'; UPDATE cars SET miles_driven = car.miles_driven + 30 WHERE model='pinto'; COMMIT TRANSACTION IF car.is_running; This can be simplified by naming the updates with an AS <name> syntax. If updates are named, a corresponding read is generated behind the scenes and its values inform the update. Here's an example, the query is functionally identical to the previous query. In the case of the user update, a read is still performed behind the scenes to enable the calculation of miles_driven + 30, but doesn't need to be named since it's not referenced anywhere else. BEGIN TRANSACTION; UPDATE users SET miles_driven += 30 WHERE name='blake'; UPDATE cars SET miles_driven += 30 WHERE model='pinto' AS car; COMMIT TRANSACTION IF car.is_running; Here’s another example, performing the canonical bank transfer: BEGIN TRANSACTION; UPDATE accounts SET balance += 100 WHERE name='blake' AS blake; UPDATE accounts SET balance -= 100 WHERE name='benedict' AS benedict; COMMIT TRANSACTION IF blake EXISTS AND benedict.balance >= 100; As you can see from the examples, column values can be referenced via a dot syntax, ie: <select_name>.<column> -> select1.value. Since the read portion of the transaction is performed before evaluating conditions or applying updates, values read can be freely applied to non-primary key values in updates. Select statements used either in checking a condition or creating an update must be restricted to a single row, either by specifying the full primary key or a limit of 1. Multi-row selects are allowed, but only for returning data to the client (see below). For evaluating conditions, = & != are available for all types, <, <=, >, >= are available for numerical types, and EXISTS, NOT EXISTS can be used for partitions, rows, and values. If any column references cannot be satisfied by the result of the reads, the condition implicitly fails. This prevents having to include a bunch of exists statements. On completion, an operation would return a boolean value indicating the operation had been applied, and a result set for each named select (but not named update). We could also support an optional RETURN keyword, which would allow the user to only return specific named selects (ie: RETURN select1, select2). Let me know what you think! Blake -- Henrik Ingo +358 40 569 7354<tel:358405697354> [Visit us online.]<https://www.datastax.com/> [Visit us on Twitter.] <https://twitter.com/DataStaxEng> [Visit us on YouTube.] <https://urldefense.proofpoint.com/v2/url?u=https-3A__www.youtube.com_channel_UCqA6zOSMpQ55vvguq4Y0jAg&d=DwMFaQ&c=adz96Xi0w1RHqtPMowiL2g&r=IFj3MdIKYLLXIUhYdUGB0cTzTlxyCb7_VUmICBaYilU&m=bmIfaie9O3fWJAu6lESvWj3HajV4VFwgwgVuKmxKZmE&s=16sY48_kvIb7sRQORknZrr3V8iLTfemFKbMVNZhdwgw&e=> [Visit my LinkedIn profile.] <https://www.linkedin.com/in/heingo/>