Re: Why Postgres doesn't use TID scan?

2018-12-19 Thread Andrew Gierth
> "Vladimir" == Vladimir Ryabtsev  writes:

 Vladimir> I can't believe it.
 Vladimir> I see some recommendations in Internet to do like this

well, 90% of what you read on the Internet is wrong.

 Vladimir> Did it really work in 2011? Are you saying they broke it?
 Vladimir> It's a shame...

The method in that SO link does work, it's just slow. The workaround is
to do it like this instead:

delete from mytable
 where ctid = any (array(select ctid from mytable
  where ...
  order by ...
  limit 1000));

But of course that's still an ugly hack.

-- 
Andrew (irc:RhodiumToad)



Re: Why Postgres doesn't use TID scan?

2018-12-19 Thread Vladimir Ryabtsev
> The workaround is to do it like this instead:

Strange, I tried to do like this, but the first thing came into my
mind was array_agg()
not array():

delete from log
where ctid = any(
select array_agg(ctid) from (
select ctid from log
where timestamp < now() at time zone 'pst' - interval '2 month'
limit 10
) v);

This query complained like this:

ERROR: operator does not exist: tid = tid[]
LINE 2: where ctid = any(
   ^
HINT: No operator matches the given name and argument type(s). You might
need to add explicit type casts.

Which is strange because both array(select ...) and select array_agg() ...
return the same datatype ctid[].

> But of course that's still an ugly hack.

Come on... Due to declarative nature of SQL developers sometimes need to
write much dirtier and uglier hacks.
This one is just a fluffy hacky.

--
Vlad


Re: Why Postgres doesn't use TID scan?

2018-12-19 Thread Andrew Gierth
> "Vladimir" == Vladimir Ryabtsev  writes:

 >> The workaround is to do it like this instead:

 Vladimir> Strange, I tried to do like this, but the first thing came
 Vladimir> into my mind was array_agg() not array():

 Vladimir> delete from log
 Vladimir> where ctid = any(
 Vladimir> select array_agg(ctid) from (
 Vladimir> select ctid from log
 Vladimir> where timestamp < now() at time zone 'pst' - interval '2 
month'
 Vladimir> limit 10
 Vladimir> ) v);

 Vladimir> This query complained like this:

 Vladimir> ERROR: operator does not exist: tid = tid[]
 Vladimir> LINE 2: where ctid = any(
 Vladimir>^
 Vladimir> HINT: No operator matches the given name and argument
 Vladimir> type(s). You might need to add explicit type casts.

 Vladimir> Which is strange because both array(select ...) and select
 Vladimir> array_agg() ... return the same datatype ctid[].

It's not so strange when you understand what's going on here. The
fundamental issue is that "ANY" has two meanings in PG, one of them
following the SQL standard and one not:

  x  ANY ()   -- standard
  x  ANY ()  -- PG-specific

In the first case, the behavior follows the standard, which makes this a
generalization of IN: specifically, in the standard,

  x IN (select ...)

is just alternative syntax for

  x = ANY (select ...)

Obviously in this form, the result of the subselect is expected to be of
the same type and degree as "x", hence the error since tid and tid[] are
not the same type.

(Because this is the standard form, it's the one chosen when the syntax
is otherwise ambiguous between the two.)

The form x = ANY (somearray) is a PG extension, but because of the
ambiguity, the array can only be specified by something that doesn't
parse as a select. So array() works (as does array[] for the commonly
used case of an explicit list), but if you want to use a select to get
the array value, you have to add some kind of syntax that makes it not
parse as a select, e.g.:

  WHERE ctid = ANY ((select array_agg(...) from ...)::tid[])

In this case the cast forces it to parse as an expression and not a
subquery (it's not enough to just use the parens alone, because PG,
again unlike the SQL standard, allows any number of excess parens around
a subquery).

-- 
Andrew (irc:RhodiumToad)



Re: Why Postgres doesn't use TID scan?

2018-12-19 Thread Vladimir Ryabtsev
> The fundamental issue is that "ANY" has two meanings in PG, one of them
following the SQL standard and one not:

Oh yes, I was aware about two forms but it did not come into my mind, I was
thinking I use the same form in both cases since my query returns only one
row and column.
Thanks for pointing me into that.

--
Vlad