split_part for the last element
Hi,

Suppose I need to split a string on a delimiter and select one of the
resulting components. If I want a specific component counting from the
start, that's easy:

    split_part('foo bar baz', ' ', 1) -> 'foo'

But if I want the last component, I have several less-than-ideal options:

1. (string_to_array('foo bar baz', ' '))[cardinality(string_to_array('foo bar baz', ' '))]
2. reverse(split_part(reverse('foo bar baz'), ' ', 1))
3. (regexp_match('foo bar baz', '\S*$'))[1]

Option 1 is probably the most understandable, especially if you are
willing to introduce a temporary parts array:

    select parts[cardinality(parts)] from string_to_array('foo bar baz', ' ') parts

But if the strings are long, this needlessly builds an array just to
throw it away. Option 2 has similar efficiency problems and is just
kind of silly. Option 3 is probably the best, but it's still a good
bit more complicated than a simple split_part invocation.

Is there another option I'm missing? Would there be interest in
extending split_part so that negative indices count from the end, as in:

    split_part('foo bar baz', ' ', -1) -> 'baz'

Or adding a split_part_end function in which positive indices count
from the end:

    split_part_end('foo bar baz', ' ', 1) -> 'baz'

I'd be happy to prepare a patch if so.

Cheers,
Nikhil
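For concreteness, the proposed negative-index semantics can be modeled in
a short Python sketch. This is only a reference model of the behavior
being proposed (not the actual C implementation); the out-of-range and
zero-index behavior mirrors split_part's documented behavior in
PostgreSQL.

```python
def split_part(string: str, delimiter: str, n: int) -> str:
    """Reference model of split_part with the proposed extension.

    Positive n counts fields from the start (1-based, as in PostgreSQL);
    the proposal lets negative n count from the end. An out-of-range n
    returns the empty string, matching split_part's existing behavior,
    and n = 0 is an error.
    """
    if n == 0:
        raise ValueError("field position must be nonzero")
    parts = string.split(delimiter)
    # Map 1-based (or negative, from-the-end) n to a 0-based index.
    index = n - 1 if n > 0 else len(parts) + n
    if 0 <= index < len(parts):
        return parts[index]
    return ""
```

Under this model, `split_part('foo bar baz', ' ', -1)` returns `'baz'`,
and `split_part('foo bar baz', ' ', -4)` returns `''` rather than
raising, just as `split_part('foo bar baz', ' ', 4)` does today.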
Re: split_part for the last element
Right, that's option 2 in my original mail. There are several
deficiencies with that idiom:

* It is non-obvious. Sure, it might make sense to you and me, but to
  someone just learning SQL, it takes a minute to reason through why it
  works. They're also unlikely to invent the trick on their own.

* It is inefficient. When the strings are large, reversing them is a
  silly waste of compute.

On Fri, Oct 23, 2020 at 12:03 PM PALAYRET Jacques wrote:
>
> Hello,
>
> reverse(split_part(reverse('foo bar baz'), ' ', 1)) -> 'baz'
>
> Regards
Re: split_part for the last element
On Fri, Oct 23, 2020 at 2:21 PM David G. Johnston wrote:
> I'm torn here because this would be the first usage of this concept in
> PostgreSQL (I think).

Yeah, I also have some qualms about this design in the context of
Postgres, particularly because Postgres allows arrays to begin at
negative indices.

> Tangentially, I noticed that we have a "starts_with" function but no
> corresponding "ends_with".

Ah, interesting. On the other hand, there are both "left" and "right",
"lpad" and "rpad", and "ltrim" and "rtrim". And at least ends_with has
the fairly elegant alternative of "s LIKE '%suffix'".

> It's been a while but there used to be a systemic inertia working against
> adding minor useful functions such as these.
>
> With the new documentation layout I would at least consider updating the
> description for the normal functions with an example on how to formulate
> an expression that works contra-normally, and in the case where there does
> exist such a specialized function, naming it.

Supposing you go this route, which of the options would you envision
mentioning as the converse of split_part?
Re: split_part for the last element
On Fri, Oct 23, 2020 at 2:35 PM Tom Lane wrote:
>
> "David G. Johnston" writes:
> > On Fri, Oct 23, 2020 at 8:47 AM Nikhil Benesch wrote:
> >> Is there another option I'm missing? Would there be interest in
> >> extending split_part so that negative indices count from the end, as in:
> >>
> >>     split_part('foo bar baz', ' ', -1) -> 'baz'
>
> > I'm torn here because this would be the first usage of this concept in
> > PostgreSQL (I think).
>
> We already have some JSON functions that act like that, not to mention
> the left() and right() string functions, so I don't see that much of an
> argument against extending split_part to do it.

Oh, I didn't realize left and right already worked this way. That makes
this design much more compelling, at least to me. If the consensus is
this extension is ok, I'd be happy to prepare a patch.
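As context for the precedent Tom mentions: PostgreSQL's left() and
right() already accept negative lengths, meaning "all but the last |n|"
and "all but the first |n|" characters respectively. A Python sketch of
that documented behavior (names `pg_left`/`pg_right` are mine):

```python
def pg_left(s: str, n: int) -> str:
    """Model of PostgreSQL left(): first n characters; when n is
    negative, all but the last |n| characters."""
    # Python slicing happens to match: s[:2] -> first two,
    # s[:-2] -> all but last two, s[:0] -> ''.
    return s[:n]

def pg_right(s: str, n: int) -> str:
    """Model of PostgreSQL right(): last n characters; when n is
    negative, all but the first |n| characters."""
    if n == 0:
        return ""
    # s[-2:] -> last two; for n = -2, s[2:] -> all but first two.
    return s[-n:]
```

So `pg_left('abcde', -2)` is `'abc'` and `pg_right('abcde', -2)` is
`'cde'`, which is exactly the from-the-end convention proposed for
split_part.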
Appetite for `SELECT ... EXCLUDE`?
Both DuckDB and Snowflake, as of recently, support a nonstandard
`EXCLUDE` clause in the SELECT list to allow excluding fields from a
wildcard [0] [1]. Example from the DuckDB announcement [2]:

    SELECT * EXCLUDE (jar_jar_binks, midichlorians) FROM star_wars

Is there any appetite for adding this feature to PostgreSQL? It's quite
a bit less typing when you're querying a relation with many columns and
want to exclude only a few of those columns. Of course the downside is
that it is not (AFAIK) in the SQL standard. I searched the archives and
there are a few users asking about such a feature over the years.

To be fully transparent, I'm asking in part on behalf of Materialize
[3], where we try to follow PostgreSQL's syntax and semantics as
closely as possible in our own SQL dialect. (We're happy to carry
around extensions that PostgreSQL doesn't have, but our worst case
scenario is that we eagerly implement an extension that PostgreSQL
implements later in an incompatible way.)

[0]: https://github.com/duckdb/duckdb/issues/2199
[1]: https://docs.snowflake.com/en/release-notes/2022-11.html#select-excluding-and-renaming-specific-columns
[2]: https://docs.snowflake.com... (see [1]); https://duckdb.org/2022/05/04/friendlier-sql.html
[3]: https://materialize.com
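The intended semantics of `EXCLUDE` are simple to state: keep every
projected column except the named ones, preserving column order. A
Python sketch modeling rows as ordered dicts (the column names here are
just the Star Wars example's, and the function name is mine):

```python
def select_exclude(rows, exclude):
    """Model of SELECT * EXCLUDE (...): drop the named columns from
    each row, preserving the order of the remaining columns."""
    excluded = set(exclude)
    return [
        {col: val for col, val in row.items() if col not in excluded}
        for row in rows
    ]

rows = [
    {"name": "Luke", "jar_jar_binks": 0, "midichlorians": 20000},
    {"name": "Anakin", "jar_jar_binks": 1, "midichlorians": 27000},
]
result = select_exclude(rows, ["jar_jar_binks", "midichlorians"])
# Each resulting row retains only the "name" column.
```

The design questions the thread raises (spelling, where the clause may
appear) are about surface syntax; the column-filtering semantics above
are the uncontroversial part.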
Re: Appetite for `SELECT ... EXCLUDE`?
Thanks for the pointers, Tom. Sorry my search didn't turn those up.

On Fri, Nov 18, 2022 at 3:18 PM Tom Lane wrote:
>
> Nikhil Benesch writes:
> > Both DuckDB and Snowflake, as of recently, support a nonstandard `EXCLUDE`
> > clause in the SELECT list to allow excluding fields from a wildcard [0] [1].
> >
> > Example from the DuckDB announcement [2]:
> >
> >     SELECT * EXCLUDE (jar_jar_binks, midichlorians) FROM star_wars
> >
> > Is there any appetite for adding this feature to PostgreSQL?
>
> This has been discussed before and not gone anywhere, e.g. [1] [2].
> I think there have been more threads but that was all I found in a
> quick archive search. Anyway, as those threads illustrate, there is a
> lot of room for variation in how you spell it, where you can write it,
> and so on. My own inclination is to not do anything here until/unless
> the SQL committee standardizes something, because there's too much
> risk of finding ourselves incompatible with the standard.
>
> regards, tom lane
>
> [1] https://www.postgresql.org/message-id/flat/d51371a2-f221-1cf3-4a7d-b2242d4dafdb%40gmail.com
> [2] https://www.postgresql.org/message-id/flat/CANcm6wbR3EG7t-G%3DTxy64Yt8nR6YbpzFRuTewJQ%2BkCq%3DrZ8M2A%40mail.gmail.com
DELETE ... USING LATERAL
Is it intentional that LATERAL elements in the USING clause of a DELETE
statement can't reference the target table?

Here's a somewhat contrived example. Suppose I have a table with one
jsonb column:

    create table int_arrays (int_array jsonb);
    insert into int_arrays values ('[1]'), ('[1, 2]'), ('[3, 4, 5]'), ('[1, 1, 1]');

If I want to delete every row whose array contains a value greater than
one, I would expect the following query to work:

    delete from int_arrays using jsonb_array_elements(int_array) _ (val)
    where val::integer > 1;

But that fails with:

    ERROR:  invalid reference to FROM-clause entry for table "int_arrays"
    LINE 1: delete from int_arrays using jsonb_array_elements(int_array)...
                                                              ^
    HINT:  There is an entry for table "int_arrays", but it cannot be referenced from this part of the query.

So, ok, fine, the target and the USING clause are different scopes or
something. Except that doesn't quite explain the situation, because you
can't reuse the target table's name in the USING clause either:

    # delete from int_arrays using int_arrays;
    ERROR:  table name "int_arrays" specified more than once

Can someone shed some light on the situation here? Is there a reason
that LATERAL elements in the USING clause must be prevented from
accessing the target table, or is the restriction just emergent
behavior?

Nikhil
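For clarity about what the failing DELETE is trying to compute, here is
a Python sketch of the intended row filter, treating the table as a
list of jsonb-array strings (the function name is mine):

```python
import json

def delete_rows_with_large_elements(rows, threshold=1):
    """Model of the intended DELETE: drop every row whose JSON array
    contains an element greater than `threshold`, and return the rows
    that survive."""
    return [
        row for row in rows
        if not any(v > threshold for v in json.loads(row))
    ]

surviving = delete_rows_with_large_elements(
    ['[1]', '[1, 2]', '[3, 4, 5]', '[1, 1, 1]']
)
# Only the rows whose arrays contain nothing greater than 1 remain.
```

In SQL, the usual workaround is to push the lateral reference into a
WHERE EXISTS subquery, since subqueries in WHERE can reference the
delete target; the question above is about why the USING clause itself
can't.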
Re: DELETE ... USING LATERAL
On Mon, Oct 4, 2021 at 1:48 PM Tom Lane wrote:
> My mental model of these things is that the target table is cross-joined
> to the additional tables as though by a comma in FROM [...]

Mine as well. I just managed to dredge up some history here, though.
Turns out you explicitly disabled this feature for 9.4 to make room for
a future feature to allow left-joining the target table [0]. Is support
for that feature still desired/planned? (If it's been permanently
abandoned for whatever reason, then maybe it's safe just to revert
158b7fa?)

[0]: https://github.com/postgres/postgres/commit/158b7fa6a34006bdc70b515e14e120d3e896589b

Nikhil
Re: DELETE ... USING LATERAL
On Mon, Oct 4, 2021 at 3:21 PM Tom Lane wrote:
> Not sure what to tell you about the state of the idea that the
> target table could be re-specified in FROM/USING. I'm hesitant
> to close the door on it permanently, because people do periodically
> wish to be able to left-join the target to something else. But
> the fact that no one's done anything about it for years suggests
> that it's not that high on anyone's wish list.

Makes sense. Thanks for the insight. Sounds like the status quo is just
fine.

Nikhil