split_part for the last element

2020-10-23 Thread Nikhil Benesch
Hi,

Suppose I need to split a string on a delimiter and select one of the
resulting components. If I want a specific component counting from the
start, that's easy:

split_part('foo bar baz', ' ', 1) -> 'foo'

But if I want the last component, I have several less-than-ideal options:

1. (string_to_array('foo bar baz', '
'))[cardinality(string_to_array('foo bar baz', ' ')) - 1]
2. reverse(split_part(reverse('foo bar baz'), ' ', 1))
3. (regexp_match('foo baz bar', '\S*$'))[1]

Option 1 is probably the most understandable, especially if you are
willing to introduce a temporary parts array:

select parts[cardinality(parts) - 1] from string_to_array('foo bar
baz', ' ') parts

But if the strings are long, this needlessly builds an array just to
throw it away. Option 2 has similar efficiency problems and is just
kind of silly. Option 3 is probably the best, but it's still a good
bit more complicated than a simple split_part invocation.

Is there another option I'm missing? Would there be interest in
extending split part so that negative indices counted from the end, as
in:

split_part('foo bar baz', ' ', -1) -> 'baz'

Or adding a split_part_end function in which positive indices counted
from the end:

split_part_end('foo bar baz', ' ', 1) -> 'baz'

I'd be happy to prepare a patch if so.

Cheers,
Nikhil




Re: split_part for the last element

2020-10-23 Thread Nikhil Benesch
Right, that's option 2 in my original mail. There are several
deficiencies with that idiom:

  * It is non-obvious. Sure, it might make sense to you and I, but to
someone just learning SQL, it takes a minute to reason through why it
works. They're also unlikely to invent the trick on their own.
  * It is inefficient. When the strings are large reversing the
strings is a silly waste of compute.

On Fri, Oct 23, 2020 at 12:03 PM PALAYRET Jacques
 wrote:
>
> Hello,
>
>   reverse(split_part(reverse('foo bar baz'), ' ', 1))   -> 'baz'
>
> Regards
>
> - Mail original -
> De: "Nikhil Benesch" 
> À: pgsql-general@lists.postgresql.org
> Envoyé: Vendredi 23 Octobre 2020 17:47:16
> Objet: split_part for the last element
>
> Hi,
>
> Suppose I need to split a string on a delimiter and select one of the
> resulting components. If I want a specific component counting from the
> start, that's easy:
>
> split_part('foo bar baz', ' ', 1) -> 'foo'
>
> But if I want the last component, I have several less-than-ideal options:
>
> 1. (string_to_array('foo bar baz', '
> '))[cardinality(string_to_array('foo bar baz', ' ')) - 1]
> 2. reverse(split_part(reverse('foo bar baz'), ' ', 1))
> 3. (regexp_match('foo baz bar', '\S*$'))[1]
>
> Option 1 is probably the most understandable, especially if you are
> willing to introduce a temporary parts array:
>
> select parts[cardinality(parts) - 1] from string_to_array('foo bar
> baz', ' ') parts
>
> But if the strings are long, this needlessly builds an array just to
> throw it away. Option 2 has similar efficiency problems and is just
> kind of silly. Option 3 is probably the best, but it's still a good
> bit more complicated than a simple split_part invocation.
>
> Is there another option I'm missing? Would there be interest in
> extending split part so that negative indices counted from the end, as
> in:
>
> split_part('foo bar baz', ' ', -1) -> 'baz'
>
> Or adding a split_part_end function in which positive indices counted
> from the end:
>
> split_part_end('foo bar baz', ' ', 1) -> 'baz'
>
> I'd be happy to prepare a patch if so.
>
> Cheers,
> Nikhil
>
>




Re: split_part for the last element

2020-10-23 Thread Nikhil Benesch
On Fri, Oct 23, 2020 at 2:21 PM David G. Johnston
 wrote:

> I'm torn here because this would be the first usage of this concept in
> PostgreSQL (I think).

Yeah, I also have some qualms about this design in the context of Postgres.
Particularly because Postgres allows arrays to begin at negative indices.

> Tangentially, I noticed that we have a "starts_with" function but no
> corresponding "end_with".

Ah, interesting. On the other hand, there are both "left" and "right",
"lpad" and "rpad", and "ltrim" and "rtrim". And at least ends_with has the
fairly elegant alternative of "s LIKE '%suffix'".

> It's been a while but there used to be a systemic inertia working against
> adding minor useful functions such as these.
>
> With the new documentation layout I would at least consider updating the
> description for the normal functions with an example on how to formulate
> an expression that works contra-normally, and in the case where there does
> exist such a specialized function, naming it.

Supposing you go this route, which of the options would you envision
mentioning as the converse of split_part?




Re: split_part for the last element

2020-10-23 Thread Nikhil Benesch
On Fri, Oct 23, 2020 at 2:35 PM Tom Lane  wrote:
>
> "David G. Johnston"  writes:
> > On Fri, Oct 23, 2020 at 8:47 AM Nikhil Benesch 
> > wrote:
> >> Is there another option I'm missing? Would there be interest in
> >> extending split part so that negative indices counted from the end, as
> >> in:
> >>  split_part('foo bar baz', ' ', -1) -> 'baz'
>
> > I'm torn here because this would be the first usage of this concept in
> > PostgreSQL (I think).
>
> We already have some JSON functions that act like that, not to mention
> the left() and right() string functions, so I don't see that much of an
> argument against extending split_part to do it.

Oh, I didn't realize left and right already worked this way. That
makes this design much more compelling, at least to me.

If the consensus is this extension is ok, I'd be happy to prepare a patch.




Appetite for `SELECT ... EXCLUDE`?

2022-11-18 Thread Nikhil Benesch
Both DuckDB and Snowflake, as of recently, support a nonstandard `EXCLUDE`
clause in the SELECT list to allow excluding fields from a wildcard [0] [1].

Example from the DuckDB announcement [2]:

   SELECT * EXCLUDE (jar_jar_binks, midichlorians) FROM star_wars

Is there any appetite for adding this feature to PostgreSQL? It's quite a bit
less typing when you're querying a relation with many columns and want to
exclude only a few of those columns. Of course the downside is that it is not
(AFAIK) in the SQL standard. I searched the archives and there are a few users
asking about such a feature over the years.

To be fully transparent, I'm asking in part on behalf of Materialize [3], where
we try to follow PostgreSQL's syntax and semantics as closely as possible in our
own SQL dialect. (We're happy to carry around extensions that PostgreSQL
doesn't have, but our worst case scenario is that we eagerly implement an
extension that PostgreSQL implements later in an incompatible way.)

[0]: https://github.com/duckdb/duckdb/issues/2199
[1]: 
https://docs.snowflake.com/en/release-notes/2022-11.html#select-excluding-and-renaming-specific-columns
[2]: https://duckdb.org/2022/05/04/friendlier-sql.html
[3]: https://materialize.com




Re: Appetite for `SELECT ... EXCLUDE`?

2022-11-18 Thread Nikhil Benesch
Thanks for the pointers, Tom. Sorry my search didn't turn those up.

On Fri, Nov 18, 2022 at 3:18 PM Tom Lane  wrote:
>
> Nikhil Benesch  writes:
> > Both DuckDB and Snowflake, as of recently, support a nonstandard `EXCLUDE`
> > clause in the SELECT list to allow excluding fields from a wildcard [0] [1].
>
> > Example from the DuckDB announcement [2]:
>
> >SELECT * EXCLUDE (jar_jar_binks, midichlorians) FROM star_wars
>
> > Is there any appetite for adding this feature to PostgreSQL?
>
> This has been discussed before and not gone anywhere, e.g. [1] [2].
> I think there have been more threads but that was all I found in a
> quick archive search.  Anyway, as those threads illustrate, there is a
> lot of room for variation in how you spell it, where you can write it,
> and so on.  My own inclination is to not do anything here until/unless
> the SQL committee standardizes something, because there's too much
> risk of finding ourselves incompatible with the standard.
>
> regards, tom lane
>
> [1] 
> https://www.postgresql.org/message-id/flat/d51371a2-f221-1cf3-4a7d-b2242d4dafdb%40gmail.com
> [2] 
> https://www.postgresql.org/message-id/flat/CANcm6wbR3EG7t-G%3DTxy64Yt8nR6YbpzFRuTewJQ%2BkCq%3DrZ8M2A%40mail.gmail.com




DELETE ... USING LATERAL

2021-10-04 Thread Nikhil Benesch
Is it intentional that LATERAL elements in a USING clause of a DELETE
statement can't reference the table declared in the FROM clause?

Here's a somewhat contrived example. Suppose I have a table with one
jsonb column:

create table int_arrays (int_array jsonb);
insert into int_arrays values ('[1]'), ('[1, 2]'), ('[3, 4, 5]'),
('[1, 1, 1]');

If I want to delete every row whose array contains a value greater
than one, I would expect the following query to work:

delete from int_arrays using jsonb_array_each(int_array) _ (val)
where val::integer > 1;

But that fails with:

ERROR:  invalid reference to FROM-clause entry for table "int_arrays"
LINE 1: delete from int_arrays using jsonb_array_each(int_array) _ (...
  ^
HINT:  There is an entry for table "int_arrays", but it cannot be
referenced from this part of the query.

So, ok, fine, the FROM and USING clauses are different scopes or
something. Except that doesn't quite explain the situation, because
you can't reuse the FROM table name in the USING clause:

# delete from int_arrays using int_arrays;
ERROR:  table name "int_arrays" specified more than once

Can someone shed some light on the situation here? Is there a reason
that LATERAL elements in the USING clause must be prevented from
accessing the FROM table or is the restriction just emergent behavior?

Nikhil




Re: DELETE ... USING LATERAL

2021-10-04 Thread Nikhil Benesch
On Mon, Oct 4, 2021 at 1:48 PM Tom Lane  wrote:
> My mental model of these things is that the target table is cross-joined
> to the additional tables as though by a comma in FROM [...]

Mine as well.

I just managed to dredge up some history here though. Turns out you
explicitly disabled this feature for 9.4 to make room for a future
feature to allow left-joining the target table [0]. Is support for
that feature still desired/planned? (If it's been permanently
abandoned for whatever reason, then maybe it's safe just to revert
158b7fa?)

[0]: 
https://github.com/postgres/postgres/commit/158b7fa6a34006bdc70b515e14e120d3e896589b

Nikhil




Re: DELETE ... USING LATERAL

2021-10-07 Thread Nikhil Benesch
On Mon, Oct 4, 2021 at 3:21 PM Tom Lane  wrote:
> Not sure what to tell you about the state of the idea that the
> target table could be re-specified in FROM/USING.  I'm hesitant
> to close the door on it permanently, because people do periodically
> wish to be able to left-join the target to something else.  But
> the fact that no one's done anything about it for years suggests
> that it's not that high on anyone's wish list.

Makes sense. Thanks for the insight. Sounds like the status quo is just fine.

Nikhil