Re: DISTINCT *and* ORDER BY in aggregate functions on expressions(!)y

2023-03-03 Thread Alban Hertroys



> On 3 Mar 2023, at 0:02, Thorsten Glaser  wrote:
> 
> On Tue, 28 Feb 2023, Alban Hertroys wrote:
> 
>> Perhaps you can use a lateral cross join to get the result of
>> jsonb_build_object as a jsonb value to pass around?
> 
> I don’t see how. (But then I’ve not yet worked with lateral JOINs.)

You posted this bit:

> jsonb_build_object('opening_times',
>   jsonb_agg(DISTINCT jsonb_build_object(
>   'weekday', cot.weekday,
>   'from_hour', cot.from_hour,
>   'to_hour', cot.to_hour)
>   ORDER BY
>   jsonb_build_object(
>   'weekday', cot.weekday,
>   'from_hour', cot.from_hour,
>   'to_hour', cot.to_hour)->>'weekday',
>   jsonb_build_object(
>   'weekday', cot.weekday,
>   'from_hour', cot.from_hour,
>   'to_hour', cot.to_hour)->>'from_hour',
>   jsonb_build_object(
>   'weekday', cot.weekday,
>   'from_hour', cot.from_hour,
>   'to_hour', cot.to_hour)->>'to_hour')
> )


You can rewrite that into something like this:

select jsonb_build_object('opening_times’,
obj
ORDER BY
obj->>'weekday’,
obj->>'from_hour’,
obj->>'to_hour')
)
from cot
cross join lateral jsonb_agg(jsonb_build_object(
'weekday', cot.weekday,
'from_hour', cot.from_hour,
'to_hour', cot.to_hour) obj

That’s off the top of my head and I did leave out the DISTINCT. Just to show 
the concept here. A bit of experimenting and reading should get you there, I’m 
keeping $work waiting :P

(…)

> WITH
>cgwaj AS (
>   SELECT cgwa.id AS id, jsonb_build_object(
>   'weekday', cgwa.weekday,
>   'forenoon', cgwa.forenoon,
>   'afternoon', cgwa.afternoon,
>   'evening', cgwa.evening) AS obj
>   FROM core_generalworkavailability cgwa
>),
>-- … same for opening times
> SELECT cp.email, …,
>   -- …
>   jsonb_build_object('possible_work_times', COALESCE(
>   jsonb_agg(DISTINCT cgwaj.obj ORDER BY cgwaj.obj->>'weekday',
>   cgwaj.obj->>'forenoon', cgwaj.obj->>'afternoon',
>   cgwaj.obj->>'evening')
>   FILTER (WHERE cgwaj.id IS NOT NULL))) ||
>   -- …
>FROM core_person cp
>   -- …
>   LEFT JOIN core_person_possible_work_times cppwt ON cppwt.person_id=cp.id
>   LEFT JOIN cgwaj ON cgwaj.id=cppwt.generalworkavailability_id
>   -- …
> 
> That is, add a CTE for each m:n-attached table whose “value” is
> an object, not a single field, keep the id field; LEFT JOIN that
> (instead of the original table), then we have a field to use in
> ORDER BY.
> 
> I think. I’ve not yet tried it (I don’t have access to that DB
> normally, I was just helping out).
> 
> This avoids sub-SELECTs in the sense of needing to run one for
> each user row, because the innermost JSON object building needs
> to be done for each (connected (if the query is not filtering on
> specific users)) row of the “property table”, anyway. (And even
> if filtered, that can be passed down.)
> 
> bye,
> //mirabilos
> -- 
> Solange man keine schmutzigen Tricks macht, und ich meine *wirklich*
> schmutzige Tricks, wie bei einer doppelt verketteten Liste beide
> Pointer XORen und in nur einem Word speichern, funktioniert Boehm ganz
> hervorragend. -- Andreas Bogk über boehm-gc in d.a.s.r
> 

Alban Hertroys
--
There is always an exception to always.








Re: Postgres undeterministically uses a bad plan, how to convince it otherwise?

2023-03-03 Thread cen

Likely to be safe, you'd just include both. The problem is that the
query planner makes use of equivalence classes to deduce equivalence
in quals.

If you have a query such as:

select * from t1 inner join t2 on t1.x = t2.y where t1.x = 3;

then the planner can deduce that t2.y must also be 3 and that qual can
be pushed down to the scan level. If t2.y = 3 is quite selective and
there's an index on that column, then this deduction is likely going
to be a very good win, as the alternative of not using it requires
looking at all rows in t2.


Does equivalency only work for constants as in the sample you provided 
or will it also be found in b1."number" and t0."block_number" in my 
sample query?


Meaning the columns could be used interchangeably in all the WHERE 
clauses and the ORDER clause, then it is a matter of figuring out what 
costs less.








Dropping behavior for unique CONSTRAINTs

2023-03-03 Thread Conner Bean

Hi folks,I'm curious if there are any docs supporting the functionality behind dropping 
unique constraints. For context, I am interested in enforcing uniqueness on a column. 
This table is heavily used, and I plan on dropping the constraint in the future. I wanted 
to avoid using a unique index since dropping them requires an exclusive lock and cannot 
be done concurrently. My thought was to then use a unique constraint, since I've read 
unofficial docs[0] that say these can be dropped safely with no lock.However, since a 
unique index would be the backing index to the unique constraint, I'm curious how this 
would work in practice (or if it even does!). Specifically if my theory of: "I can 
use a unique constraint to avoid obtaining an exclusive lock on my table" actually 
works.Thanks for any and all information, it's greatly appreciated, and apologies if I 
missed any existing documentation.Best, CSB0: 
https://leopard.in.ua/2016/09/20/safe-and-unsafe-operations-postgresql#.ZAEFpezMJLR 

Re: Postgres undeterministically uses a bad plan, how to convince it otherwise?

2023-03-03 Thread David Rowley
On Fri, 3 Mar 2023 at 22:35, cen  wrote:
> Does equivalency only work for constants as in the sample you provided
> or will it also be found in b1."number" and t0."block_number" in my
> sample query?

It works for more than constants, but in this case, it's the presence
of the constant that would allow the qual to be pushed down into the
scan level of the other relation.

For cases such as t1 INNER JOIN t2 ON t1.x = t2.y INNER JOIN t3 ON
t2.y = t3.z the equivalence classes could allow t1 to be joined to t3
using t1.x = t3.z before t2 is joined in, so certainly it still does
things with classes not containing constants. No derived quals will
get pushed down to the scan level without constants, however.

> Meaning the columns could be used interchangeably in all the WHERE
> clauses and the ORDER clause, then it is a matter of figuring out what
> costs less.

The problem is you have:   AND (t0."block_number" >= 30926000) AND
(t0."block_number" <= 31957494).  The >= and <= operators are not used
to help build the equivalence class.  You'd see a very different plan
had you just been asking for block_number = 30926000.

I think your best solution will be to just also include the seemingly
surplus: AND (b1."number" >= 30926000)  AND (b1."number" <= 31957494)
quals.

David




PG16devel - vacuum_freeze_table_age seems not being taken into account

2023-03-03 Thread Simon Elbaz
Hi List,

I am doing some tests to understand vacuum_freeze_table_age and
vacuum_freeze_min_age parameters.

Here is my configuration:

postgres=# select name, setting from pg_settings where name =
'vacuum_freeze_min_age';
 name  | setting
---+-
 vacuum_freeze_min_age | 50
(1 ligne)

postgres=# select name, setting from pg_settings where name =
'vacuum_freeze_table_age';
  name   | setting
-+-
 vacuum_freeze_table_age | 150
(1 ligne)

test table has an age of 51.

hydrodb=# SELECT c.oid::regclass as table_name,
   greatest(age(c.relfrozenxid),age(t.relfrozenxid)) as age
FROM pg_class c
LEFT JOIN pg_class t ON c.reltoastrelid = t.oid
WHERE c.relkind IN ('r', 'm') and c.relname='test';
 table_name | age
+-
 test   |  51
(1 ligne)

I expected it not to be processed by vacuum freeze.
However it has been entirely frozen.
Moreover, among the 51 rows, only 1 was eligible for freeze because its XID
was older than vacuum_freeze_min_age.

hydrodb=# vacuum  verbose test;
INFO:  vacuuming "hydrodb.public.test"
INFO:  finished vacuuming "hydrodb.public.test": index scans: 0
pages: 0 removed, 447 remain, 1 scanned (0.22% of total)
tuples: 0 removed, 100595 remain, 0 are dead but not yet removable
removable cutoff: 1569, which was 0 XIDs old when operation ended
*new relfrozenxid: 1569, which is 51 XIDs ahead of previous value*
frozen: 1 pages from table (0.22% of total) had 51 tuples frozen
parcours d'index non nécessaire : 0 blocs de la table (0.00% au total) ont
0 versions mortes de lignes supprimées
vitesse moyenne de lecture : 0.000 Mo/s, vitesse moyenne d'écriture :
58.302 Mo/s
utilisation du cache : 8 récupérés, 0 ratés, 1 modifiés
utilisation des WAL : 3 enregistrements, 1 images complètes de blocs, 6302
octets
utilisation du système : CPU : utilisateur : 0.00 s, système : 0.00 s,
temps passé : 0.00 s
VACUUM

Thanks for any explanation
Simon


Re: Dropping behavior for unique CONSTRAINTs

2023-03-03 Thread David Rowley
On Fri, 3 Mar 2023 at 23:17, Conner Bean  wrote:
I wanted to avoid using a unique index since dropping them requires an
exclusive lock and cannot be done concurrently. My thought was to then
use a unique constraint, since I've read unofficial docs[0] that say
these can be dropped safely with no lock.

You should try the official documents.  You won't find any wording in
those that say that a unique constraint can be dropped without any
locking.

If you look at https://www.postgresql.org/docs/current/sql-altertable.html
you'll see "Note that the lock level required may differ for each
subform. An ACCESS EXCLUSIVE lock is acquired unless explicitly
noted.", and if you look at DROP CONSTRAINT that it mentions nothing
about any lower-level locks, so you can assume that DROP CONSTRAINT
obtains an access exclusive lock on the table being altered.

If you have a look at
https://www.postgresql.org/docs/15/sql-dropindex.html check out the
CONCURRENTLY option. That option allows an index to be dropped without
blocking concurrent reads and writes to the table. It seems like just
having a unique index without the constraint is likely your best bet
if you can't afford to block any traffic for the brief moment it would
take to drop the constraint.

David




Re: PG16devel - vacuum_freeze_table_age seems not being taken into account

2023-03-03 Thread David Rowley
On Fri, 3 Mar 2023 at 23:43, Simon Elbaz  wrote:
> hydrodb=# SELECT c.oid::regclass as table_name,
>greatest(age(c.relfrozenxid),age(t.relfrozenxid)) as age
> FROM pg_class c
> LEFT JOIN pg_class t ON c.reltoastrelid = t.oid
> WHERE c.relkind IN ('r', 'm') and c.relname='test';
>  table_name | age
> +-
>  test   |  51
> (1 ligne)
>
> I expected it not to be processed by vacuum freeze.
> However it has been entirely frozen.

You may have missed the wording in the docs about the FREEZE option.
"Specifying FREEZE is equivalent to performing VACUUM with the
vacuum_freeze_min_age and vacuum_freeze_table_age parameters set to
zero." [0]

David

[0] https://www.postgresql.org/docs/current/sql-vacuum.html




Re: PG16devel - vacuum_freeze_table_age seems not being taken into account

2023-03-03 Thread Simon Elbaz
I ran vacuum without the freeze option as you can see below.

Simon

On Fri, Mar 3, 2023 at 12:01 PM David Rowley  wrote:

> On Fri, 3 Mar 2023 at 23:43, Simon Elbaz  wrote:
> > hydrodb=# SELECT c.oid::regclass as table_name,
> >greatest(age(c.relfrozenxid),age(t.relfrozenxid)) as age
> > FROM pg_class c
> > LEFT JOIN pg_class t ON c.reltoastrelid = t.oid
> > WHERE c.relkind IN ('r', 'm') and c.relname='test';
> >  table_name | age
> > +-
> >  test   |  51
> > (1 ligne)
> >
> > I expected it not to be processed by vacuum freeze.
> > However it has been entirely frozen.
>
> You may have missed the wording in the docs about the FREEZE option.
> "Specifying FREEZE is equivalent to performing VACUUM with the
> vacuum_freeze_min_age and vacuum_freeze_table_age parameters set to
> zero." [0]
>
> David
>
> [0] https://www.postgresql.org/docs/current/sql-vacuum.html
>


Re: PG16devel - vacuum_freeze_table_age seems not being taken into account

2023-03-03 Thread Peter Geoghegan
On Fri, Mar 3, 2023 at 2:43 AM Simon Elbaz  wrote:
> I expected it not to be processed by vacuum freeze.
> However it has been entirely frozen.
> Moreover, among the 51 rows, only 1 was eligible for freeze because its XID 
> was older than vacuum_freeze_min_age.

The effect that you noticed is a consequence of page-level freezing,
which is new to Postgres 16. VACUUM will now freeze all of the tuples
on a page whenever it needs to freeze any tuples at all (barring any
tuples that are fundamentally ineligible due to being after the
removable/freezable cutoff). This is justified by the cost profile.
Once we decide to freeze at least one tuple of a page, the added cost
in WAL is low enough that it really doesn't make sense to not just
freeze everything.

The page that gets frozen by your test case is also set all-frozen in
the visibility map. Without the optimization, we'd have frozen that
one tuple and then set the page all-visible. The page would likely be
frozen again by the next aggressive VACUUM, which is usually much more
expensive.

-- 
Peter Geoghegan




Re: DISTINCT *and* ORDER BY in aggregate functions on expressions(!)y

2023-03-03 Thread Thorsten Glaser
On Fri, 3 Mar 2023, Alban Hertroys wrote:

>You can rewrite that into something like this:
>
>select jsonb_build_object('opening_times’,
>   obj
>   ORDER BY
>   obj->>'weekday’,
>   obj->>'from_hour’,
>   obj->>'to_hour')
>)
>from cot
>cross join lateral jsonb_agg(jsonb_build_object(
>   'weekday', cot.weekday,
>   'from_hour', cot.from_hour,
>   'to_hour', cot.to_hour) obj

But isn’t that the same as with a regular LEFT JOIN?

>>cgwaj AS (
>>  SELECT cgwa.id AS id, jsonb_build_object(
>>  'weekday', cgwa.weekday,
>>  'forenoon', cgwa.forenoon,
>>  'afternoon', cgwa.afternoon,
>>  'evening', cgwa.evening) AS obj
>>  FROM core_generalworkavailability cgwa

plus

>>  LEFT JOIN cgwaj ON cgwaj.id=cppwt.generalworkavailability_id

With the addition that I can aggregate…

bye,
//mirabilos
-- 
15:41⎜ Somebody write a testsuite for helloworld :-)




Re: Converting row elements into a arrays?

2023-03-03 Thread Merlin Moncure
On Thu, Mar 2, 2023 at 3:47 PM Ron  wrote

> I'm used to the bog standard COUNT, AVG, MIN, MAX, SUM. It didn't occur to
> me that there would be others...
>

wait until you find out you can write your own:

CREATE OR REPLACE FUNCTION agg_leftagg(l TEXT, r anyelement) returns text as
$$
BEGIN
  RETURN CASE
WHEN l IS NOT NULL THEN  format('%s-%s', l, r)
ELSE r::TEXT
  END;
END;
$$ LANGUAGE PLPGSQL;

CREATE AGGREGATE leftagg(anyelement) (SFUNC=agg_leftagg, STYPE=TEXT);

CREATE TEMP TABLE s AS SELECT generate_series(1,10) s;

SELECT leftagg(s) FROM s GROUP BY random() > .5;

postgres=# SELECT leftagg(s) FROM s GROUP BY random() > .5;
  leftagg

 2-3-5-6-10
 1-4-7-8-9
(2 rows)

this is why I fell in love with postgres 20 years ago, and never looked back

merlion


Re: Dropping behavior for unique CONSTRAINTs

2023-03-03 Thread Ron

On 3/3/23 04:54, David Rowley wrote:

On Fri, 3 Mar 2023 at 23:17, Conner Bean  wrote:
I wanted to avoid using a unique index since dropping them requires an
exclusive lock and cannot be done concurrently. My thought was to then
use a unique constraint, since I've read unofficial docs[0] that say
these can be dropped safely with no lock.

You should try the official documents.  You won't find any wording in
those that say that a unique constraint can be dropped without any
locking.

If you look athttps://www.postgresql.org/docs/current/sql-altertable.html
you'll see "Note that the lock level required may differ for each
subform. An ACCESS EXCLUSIVE lock is acquired unless explicitly
noted.", and if you look at DROP CONSTRAINT that it mentions nothing
about any lower-level locks, so you can assume that DROP CONSTRAINT
obtains an access exclusive lock on the table being altered.

If you have a look at
https://www.postgresql.org/docs/15/sql-dropindex.html  check out the
CONCURRENTLY option. That option allows an index to be dropped without
blocking concurrent reads and writes to the table. It seems like just
having a unique index without the constraint is likely your best bet
if you can't afford to block any traffic for the brief moment it would
take to drop the constraint.


That doc page says this about CONCURRENTLY:
"
There are several caveats to be aware of when using this option. Only one 
index name can be specified, and the CASCADE option is not supported. (Thus, 
*an index that supports a UNIQUE or PRIMARY KEY constraint cannot be dropped 
this way.*)

"

--
Born in Arizona, moved to Babylonia.

Re: 13.x, stream replication and locale(?) issues

2023-03-03 Thread Thomas Munro
On Wed, Mar 1, 2023 at 10:30 AM Thomas Munro  wrote:
> On Wed, Mar 1, 2023 at 12:09 AM Eugene M. Zheganin  wrote:
> > 3) how do I fix it ? Should I take locale sources for ru_RU.utf8 on Linux 
> > and compile it on FreeBSD  - will it help ?
>
> Out of curiosity (I'm not saying it's a good idea!), do you know if
> FreeBSD's localedef can compile glibc's collation definitions?  In
> theory they are in a format standardised by POSIX...  I suspect there
> may be extensions and quirks...

Another thought: if you upgrade to 15, you could use ICU as the
default collation provider, and then make sure you have the same major
version of ICU on your Linux and FreeBSD systems (which probably means
compiling from source on FreeBSD, as FreeBSD tends to have a newish
ICU in packages, while popular stable Linux distributions tend to have
a fairly old one).  As far as I know, Linux and FreeBSD systems on
amd64 arch should match in every other important detail already (they
both follow the System V ABI, so there shouldn't be any layout
differences in eg structs that are sent over the wire AFAIK).

For what it's worth, for the future we are trying to figure out how to
support multi-version ICU so that you could explicitly set the
provider to ICU v72 to get that sort of thing working across OS
versions and even "close enough" cousins like your case, without
having to compile anything from source, but unfortunately we didn't
manage to get it ready in time for 16.




Re: Dropping behavior for unique CONSTRAINTs

2023-03-03 Thread David Rowley
On Sat, 4 Mar 2023 at 10:55, Ron  wrote:
> On 3/3/23 04:54, David Rowley wrote:
> If you have a look at
> https://www.postgresql.org/docs/15/sql-dropindex.html check out the
> CONCURRENTLY option. That option allows an index to be dropped without
> blocking concurrent reads and writes to the table. It seems like just
> having a unique index without the constraint is likely your best bet
> if you can't afford to block any traffic for the brief moment it would
> take to drop the constraint.
>
>
> That doc page says this about CONCURRENTLY:
> "
> There are several caveats to be aware of when using this option. Only one 
> index name can be specified, and the CASCADE option is not supported. (Thus, 
> an index that supports a UNIQUE or PRIMARY KEY constraint cannot be dropped 
> this way.)
> "

I'm not sure which one of these you think applies to the
recommendation I mentioned or if you were just generally highlighting
the limitations of DROP INDEX CONCURRENTLY.

David




Re: Converting row elements into a arrays?

2023-03-03 Thread Pavel Stehule
pá 3. 3. 2023 v 21:51 odesílatel Merlin Moncure  napsal:

> On Thu, Mar 2, 2023 at 3:47 PM Ron  wrote
>
>> I'm used to the bog standard COUNT, AVG, MIN, MAX, SUM. It didn't occur
>> to me that there would be others...
>>
>
> wait until you find out you can write your own:
>
> CREATE OR REPLACE FUNCTION agg_leftagg(l TEXT, r anyelement) returns text
> as
> $$
> BEGIN
>   RETURN CASE
> WHEN l IS NOT NULL THEN  format('%s-%s', l, r)
> ELSE r::TEXT
>   END;
> END;
> $$ LANGUAGE PLPGSQL;
>
> CREATE AGGREGATE leftagg(anyelement) (SFUNC=agg_leftagg, STYPE=TEXT);
>
> CREATE TEMP TABLE s AS SELECT generate_series(1,10) s;
>
> SELECT leftagg(s) FROM s GROUP BY random() > .5;
>
> postgres=# SELECT leftagg(s) FROM s GROUP BY random() > .5;
>   leftagg
> 
>  2-3-5-6-10
>  1-4-7-8-9
> (2 rows)
>
>
This can work, but can be slower for large data

fast (although not too effect :-)) way

(2023-03-04 06:22:56) postgres=# CREATE TEMP TABLE s AS SELECT
generate_series(1,10) s;
SELECT 10
(2023-03-04 06:22:57) postgres=# SELECT array_agg(s) FROM s GROUP BY
random() > .5;
┌──┐
│  array_agg   │
╞══╡
│ {3,6,8,9,10} │
│ {1,2,4,5,7}  │
└──┘
(2 rows)

(2023-03-04 06:23:21) postgres=# SELECT array_to_string(array_agg(s), '-')
FROM s GROUP BY random() > .5;
┌──┐
│ array_to_string  │
╞══╡
│ 1-2  │
│ 3-4-5-6-7-8-9-10 │
└──┘
(2 rows)

performance comparison on 1mil rows (but with enabled asserts), I modified
the query for returning 100 groups because building extra long strings are
slow

(2023-03-04 06:33:04) postgres=# EXPLAIN ANALYZE SELECT
array_to_string(array_agg(s), '-') FROM s GROUP BY (random() * 100)::int;
┌─┐
│   QUERY PLAN
   │
╞═╡
│ GroupAggregate  (cost=135257.34..165257.34 rows=100 width=36) (actual
time=715.400..1128.007 rows=101 loops=1)  │
│   Group Key: (((random() * '100'::double precision))::integer)
   │
│   ->  Sort  (cost=135257.34..137757.34 rows=100 width=8) (actual
time=712.689..853.335 rows=100 loops=1)│
│ Sort Key: (((random() * '100'::double precision))::integer)
  │
│ Sort Method: external merge  Disk: 17664kB
   │
│ ->  Seq Scan on s  (cost=0.00..21925.00 rows=100 width=8)
(actual time=6.135..192.553 rows=100 loops=1) │
│ Planning Time: 0.082 ms
  │
│ JIT:
   │
│   Functions: 7
   │
│   Options: Inlining false, Optimization false, Expressions true,
Deforming true │
│   Timing: Generation 0.484 ms, Inlining 0.000 ms, Optimization 0.276 ms,
Emission 5.877 ms, Total 6.637 ms  │
│ Execution Time: 1133.816 ms
  │
└─┘
(12 rows)

array_to_string(array_agg()) .. 1sec
agg_leftagg ..  27 sec

using final function can be faster

create or replace function agg_leftagg_final(anycompatiblearray)
returns text as $$
begin
  return array_to_string($1, '-');
end;
$$ language plpgsql;

CREATE AGGREGATE leftagg2(anycompatible) (SFUNC=array_append, STYPE =
anycompatiblearray, INITCOND = '{}', FINALFUNC = agg_leftagg_final);

(2023-03-04 06:57:18) postgres=# EXPLAIN ANALYZE SELECT leftagg2(s) FROM s
GROUP BY (random() * 100)::int;
┌───┐
│QUERY PLAN
│
╞═══╡
│ HashAggregate  (cost=78175.00..353487.50 rows=100 width=36) (actual
time=436.202..540.029 rows=101 loops=1)   │
│   Group Key: ((random() * '100'::double precision))::integer
 │
│   Planned Partitions: 256  Batches: 1  Memory Usage: 11930kB
 │
│   ->  Seq Scan on s  (cost=0.00..21925.00 rows=100 width=8) (actual
time=5.710..174.016 rows=100 loops=1) │
│ Planning Time: 0.231 ms
│
│ JIT:
 │
│   Functions: 7
 │
│   Options: Inlining false, Optimization false, Expressions true,
Deforming true