Re: DISTINCT *and* ORDER BY in aggregate functions on expressions(!)y
> On 3 Mar 2023, at 0:02, Thorsten Glaser wrote:
>
> On Tue, 28 Feb 2023, Alban Hertroys wrote:
>
>> Perhaps you can use a lateral cross join to get the result of
>> jsonb_build_object as a jsonb value to pass around?
>
> I don’t see how. (But then I’ve not yet worked with lateral JOINs.)

You posted this bit:

> jsonb_build_object('opening_times',
>     jsonb_agg(DISTINCT jsonb_build_object(
>         'weekday', cot.weekday,
>         'from_hour', cot.from_hour,
>         'to_hour', cot.to_hour)
>     ORDER BY
>         jsonb_build_object(
>             'weekday', cot.weekday,
>             'from_hour', cot.from_hour,
>             'to_hour', cot.to_hour)->>'weekday',
>         jsonb_build_object(
>             'weekday', cot.weekday,
>             'from_hour', cot.from_hour,
>             'to_hour', cot.to_hour)->>'from_hour',
>         jsonb_build_object(
>             'weekday', cot.weekday,
>             'from_hour', cot.from_hour,
>             'to_hour', cot.to_hour)->>'to_hour')
> )

You can rewrite that into something like this:

select jsonb_build_object('opening_times',
    obj
    ORDER BY
        obj->>'weekday',
        obj->>'from_hour',
        obj->>'to_hour')
)
from cot
cross join lateral jsonb_agg(jsonb_build_object(
    'weekday', cot.weekday,
    'from_hour', cot.from_hour,
    'to_hour', cot.to_hour) obj

That’s off the top of my head and I did leave out the DISTINCT. Just to show the concept here.
A bit of experimenting and reading should get you there, I’m keeping $work waiting :P

(…)

> WITH
> cgwaj AS (
>   SELECT cgwa.id AS id, jsonb_build_object(
>     'weekday', cgwa.weekday,
>     'forenoon', cgwa.forenoon,
>     'afternoon', cgwa.afternoon,
>     'evening', cgwa.evening) AS obj
>   FROM core_generalworkavailability cgwa
> ),
> -- … same for opening times
> SELECT cp.email, …,
>   -- …
>   jsonb_build_object('possible_work_times', COALESCE(
>     jsonb_agg(DISTINCT cgwaj.obj ORDER BY cgwaj.obj->>'weekday',
>       cgwaj.obj->>'forenoon', cgwaj.obj->>'afternoon',
>       cgwaj.obj->>'evening')
>     FILTER (WHERE cgwaj.id IS NOT NULL))) ||
>   -- …
> FROM core_person cp
>   -- …
>   LEFT JOIN core_person_possible_work_times cppwt ON cppwt.person_id=cp.id
>   LEFT JOIN cgwaj ON cgwaj.id=cppwt.generalworkavailability_id
>   -- …
>
> That is, add a CTE for each m:n-attached table whose “value” is
> an object, not a single field; keep the id field; LEFT JOIN that
> (instead of the original table), then we have a field to use in
> ORDER BY.
>
> I think. I’ve not yet tried it (I don’t have access to that DB
> normally, I was just helping out).
>
> This avoids sub-SELECTs in the sense of needing to run one for
> each user row, because the innermost JSON object building needs
> to be done for each (connected (if the query is not filtering on
> specific users)) row of the “property table”, anyway. (And even
> if filtered, that can be passed down.)
>
> bye,
> //mirabilos
> --
> As long as you don’t do dirty tricks, and I mean *really* dirty
> tricks, like XORing both pointers of a doubly linked list and
> storing them in a single word, Boehm works wonderfully.
>   -- Andreas Bogk on boehm-gc in d.a.s.r

Alban Hertroys
--
There is always an exception to always.
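For completeness, here is one way to sidestep the restriction the thread is wrestling with (with DISTINCT, an aggregate's ORDER BY expressions must appear in its argument list): deduplicate in a subquery first, then aggregate without DISTINCT so the ORDER BY is free to use any expression over the object. This is a sketch assuming a table cot(weekday, from_hour, to_hour), not tested against the original schema:

```sql
SELECT jsonb_build_object('opening_times',
         jsonb_agg(obj ORDER BY obj->>'weekday',
                                obj->>'from_hour',
                                obj->>'to_hour'))
FROM (
  -- DISTINCT here removes duplicates up front, so the outer
  -- jsonb_agg no longer needs DISTINCT and its ORDER BY may use
  -- arbitrary expressions over obj
  SELECT DISTINCT jsonb_build_object(
           'weekday',   cot.weekday,
           'from_hour', cot.from_hour,
           'to_hour',   cot.to_hour) AS obj
  FROM cot
) sub;
```

Note that ->> yields text, so hours sort lexicographically ('10' before '2'); cast to a numeric type in the ORDER BY if that matters.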
Re: Postgres undeterministically uses a bad plan, how to convince it otherwise?
> Likely to be safe, you'd just include both. The problem is that the
> query planner makes use of equivalence classes to deduce equivalence
> in quals. If you have a query such as:
>
> select * from t1 inner join t2 on t1.x = t2.y where t1.x = 3;
>
> then the planner can deduce that t2.y must also be 3 and that qual
> can be pushed down to the scan level. If t2.y = 3 is quite selective
> and there's an index on that column, then this deduction is likely
> going to be a very good win, as the alternative of not using it
> requires looking at all rows in t2.

Does equivalency only work for constants as in the sample you provided, or will it also be found in b1."number" and t0."block_number" in my sample query? Meaning the columns could be used interchangeably in all the WHERE clauses and the ORDER BY clause, and then it is a matter of figuring out what costs less.
Dropping behavior for unique CONSTRAINTs
Hi folks,

I'm curious if there are any docs covering the locking behavior of dropping unique constraints. For context, I am interested in enforcing uniqueness on a column. This table is heavily used, and I plan on dropping the constraint in the future. I wanted to avoid using a unique index, since dropping one requires an exclusive lock and cannot be done concurrently. My thought was to then use a unique constraint, since I've read unofficial docs [0] that say these can be dropped safely with no lock.

However, since a unique index would be the backing index to the unique constraint, I'm curious how this would work in practice (or if it even does!). Specifically, whether my theory of "I can use a unique constraint to avoid obtaining an exclusive lock on my table" actually works.

Thanks for any and all information, it's greatly appreciated, and apologies if I missed any existing documentation.

Best,
CSB

[0] https://leopard.in.ua/2016/09/20/safe-and-unsafe-operations-postgresql#.ZAEFpezMJLR
Re: Postgres undeterministically uses a bad plan, how to convince it otherwise?
On Fri, 3 Mar 2023 at 22:35, cen wrote:
> Does equivalency only work for constants as in the sample you provided
> or will it also be found in b1."number" and t0."block_number" in my
> sample query?

It works for more than constants, but in this case, it's the presence of the constant that would allow the qual to be pushed down into the scan level of the other relation. For cases such as:

t1 INNER JOIN t2 ON t1.x = t2.y INNER JOIN t3 ON t2.y = t3.z

the equivalence classes could allow t1 to be joined to t3 using t1.x = t3.z before t2 is joined in, so it certainly still does things with classes not containing constants. No derived quals will get pushed down to the scan level without constants, however.

> Meaning the columns could be used interchangeably in all the WHERE
> clauses and the ORDER clause, then it is a matter of figuring out what
> costs less.

The problem is you have: AND (t0."block_number" >= 30926000) AND (t0."block_number" <= 31957494). The >= and <= operators are not used to help build the equivalence class. You'd see a very different plan had you just been asking for block_number = 30926000.

I think your best solution will be to just also include the seemingly surplus AND (b1."number" >= 30926000) AND (b1."number" <= 31957494) quals.

David
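Spelled out, David's suggestion applied to the (assumed) shape of the original query looks like this — the b1 quals are logically redundant given the join condition, but they give the planner a restriction it can apply directly when scanning b1, since range quals are not propagated through equivalence classes:

```sql
SELECT t0.*
FROM t0
JOIN b1 ON b1."number" = t0."block_number"
WHERE t0."block_number" BETWEEN 30926000 AND 31957494
  -- logically implied by the join condition, but stated explicitly
  -- so the planner can restrict the scan of b1 on its own
  AND b1."number" BETWEEN 30926000 AND 31957494
ORDER BY b1."number";
```

The table and column layout here is inferred from the quals quoted in the thread; the full original query was not posted.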
PG16devel - vacuum_freeze_table_age seems not being taken into account
Hi List,

I am doing some tests to understand the vacuum_freeze_table_age and vacuum_freeze_min_age parameters. Here is my configuration:

postgres=# select name, setting from pg_settings where name = 'vacuum_freeze_min_age';
         name          | setting
-----------------------+---------
 vacuum_freeze_min_age | 50
(1 row)

postgres=# select name, setting from pg_settings where name = 'vacuum_freeze_table_age';
          name           | setting
-------------------------+---------
 vacuum_freeze_table_age | 150
(1 row)

The test table has an age of 51:

hydrodb=# SELECT c.oid::regclass as table_name,
       greatest(age(c.relfrozenxid), age(t.relfrozenxid)) as age
FROM pg_class c
LEFT JOIN pg_class t ON c.reltoastrelid = t.oid
WHERE c.relkind IN ('r', 'm') and c.relname = 'test';
 table_name | age
------------+-----
 test       |  51
(1 row)

I expected it not to be processed by vacuum freeze. However, it has been entirely frozen. Moreover, among the 51 rows, only 1 was eligible for freezing because its XID was older than vacuum_freeze_min_age.

hydrodb=# vacuum verbose test;
INFO:  vacuuming "hydrodb.public.test"
INFO:  finished vacuuming "hydrodb.public.test": index scans: 0
pages: 0 removed, 447 remain, 1 scanned (0.22% of total)
tuples: 0 removed, 100595 remain, 0 are dead but not yet removable
removable cutoff: 1569, which was 0 XIDs old when operation ended
*new relfrozenxid: 1569, which is 51 XIDs ahead of previous value*
frozen: 1 pages from table (0.22% of total) had 51 tuples frozen
index scan not needed: 0 pages from table (0.00% of total) had 0 dead item identifiers removed
avg read rate: 0.000 MB/s, avg write rate: 58.302 MB/s
buffer usage: 8 hits, 0 misses, 1 dirtied
WAL usage: 3 records, 1 full page images, 6302 bytes
system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
VACUUM

Thanks for any explanation,
Simon
Re: Dropping behavior for unique CONSTRAINTs
On Fri, 3 Mar 2023 at 23:17, Conner Bean wrote:
> I wanted to avoid using a unique index since dropping them requires an
> exclusive lock and cannot be done concurrently. My thought was to then
> use a unique constraint, since I've read unofficial docs [0] that say
> these can be dropped safely with no lock.

You should try the official documents. You won't find any wording in those that says that a unique constraint can be dropped without any locking. If you look at https://www.postgresql.org/docs/current/sql-altertable.html you'll see "Note that the lock level required may differ for each subform. An ACCESS EXCLUSIVE lock is acquired unless explicitly noted.", and if you look at DROP CONSTRAINT you'll see it mentions nothing about any lower-level locks, so you can assume that DROP CONSTRAINT obtains an access exclusive lock on the table being altered.

If you have a look at https://www.postgresql.org/docs/15/sql-dropindex.html, check out the CONCURRENTLY option. That option allows an index to be dropped without blocking concurrent reads and writes to the table. It seems like just having a unique index without the constraint is likely your best bet if you can't afford to block any traffic for the brief moment it would take to drop the constraint.

David
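In practice, David's suggestion amounts to the following lifecycle — a plain unique index that was never attached to a constraint can be both created and dropped with CONCURRENTLY (table and column names here are hypothetical, for illustration):

```sql
-- build without blocking writes; CONCURRENTLY cannot run inside a
-- transaction block
CREATE UNIQUE INDEX CONCURRENTLY accounts_email_uniq ON accounts (email);

-- later: drop without blocking reads or writes; this only works
-- because the index does not back a UNIQUE or PRIMARY KEY constraint
DROP INDEX CONCURRENTLY accounts_email_uniq;
```

Had the index instead been created via ALTER TABLE ... ADD CONSTRAINT ... UNIQUE, the concurrent drop path would not be available.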
Re: PG16devel - vacuum_freeze_table_age seems not being taken into account
On Fri, 3 Mar 2023 at 23:43, Simon Elbaz wrote:
> hydrodb=# SELECT c.oid::regclass as table_name,
>        greatest(age(c.relfrozenxid), age(t.relfrozenxid)) as age
> FROM pg_class c
> LEFT JOIN pg_class t ON c.reltoastrelid = t.oid
> WHERE c.relkind IN ('r', 'm') and c.relname = 'test';
>  table_name | age
> ------------+-----
>  test       |  51
> (1 row)
>
> I expected it not to be processed by vacuum freeze.
> However it has been entirely frozen.

You may have missed the wording in the docs about the FREEZE option: "Specifying FREEZE is equivalent to performing VACUUM with the vacuum_freeze_min_age and vacuum_freeze_table_age parameters set to zero." [0]

David

[0] https://www.postgresql.org/docs/current/sql-vacuum.html
Re: PG16devel - vacuum_freeze_table_age seems not being taken into account
I ran vacuum without the FREEZE option, as you can see below.

Simon

On Fri, Mar 3, 2023 at 12:01 PM David Rowley wrote:
> You may have missed the wording in the docs about the FREEZE option:
> "Specifying FREEZE is equivalent to performing VACUUM with the
> vacuum_freeze_min_age and vacuum_freeze_table_age parameters set to
> zero." [0]
>
> David
>
> [0] https://www.postgresql.org/docs/current/sql-vacuum.html
Re: PG16devel - vacuum_freeze_table_age seems not being taken into account
On Fri, Mar 3, 2023 at 2:43 AM Simon Elbaz wrote:
> I expected it not to be processed by vacuum freeze.
> However it has been entirely frozen.
> Moreover, among the 51 rows, only 1 was eligible for freeze because
> its XID was older than vacuum_freeze_min_age.

The effect that you noticed is a consequence of page-level freezing, which is new to Postgres 16. VACUUM will now freeze all of the tuples on a page whenever it needs to freeze any tuples at all (barring any tuples that are fundamentally ineligible due to being after the removable/freezable cutoff).

This is justified by the cost profile. Once we decide to freeze at least one tuple of a page, the added cost in WAL is low enough that it really doesn't make sense to not just freeze everything.

The page that gets frozen by your test case is also set all-frozen in the visibility map. Without the optimization, we'd have frozen that one tuple and then set the page all-visible. The page would likely be frozen again by the next aggressive VACUUM, which is usually much more expensive.

--
Peter Geoghegan
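If you want to observe the all-frozen visibility-map bits directly, the pg_visibility contrib extension exposes them per page. A sketch, assuming the extension is installed and you have sufficient privileges:

```sql
CREATE EXTENSION IF NOT EXISTS pg_visibility;

-- one row per page of "test"; after the VACUUM above, the page that
-- had a tuple frozen should show all_frozen = t
SELECT blkno, all_visible, all_frozen
FROM pg_visibility_map('test');
```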
Re: DISTINCT *and* ORDER BY in aggregate functions on expressions(!)y
On Fri, 3 Mar 2023, Alban Hertroys wrote:

> You can rewrite that into something like this:
>
> select jsonb_build_object('opening_times',
>     obj
>     ORDER BY
>         obj->>'weekday',
>         obj->>'from_hour',
>         obj->>'to_hour')
> )
> from cot
> cross join lateral jsonb_agg(jsonb_build_object(
>     'weekday', cot.weekday,
>     'from_hour', cot.from_hour,
>     'to_hour', cot.to_hour) obj

But isn’t that the same as with a regular LEFT JOIN?

>> cgwaj AS (
>>   SELECT cgwa.id AS id, jsonb_build_object(
>>     'weekday', cgwa.weekday,
>>     'forenoon', cgwa.forenoon,
>>     'afternoon', cgwa.afternoon,
>>     'evening', cgwa.evening) AS obj
>>   FROM core_generalworkavailability cgwa

plus

>> LEFT JOIN cgwaj ON cgwaj.id=cppwt.generalworkavailability_id

With the addition that I can aggregate…

bye,
//mirabilos
--
15:41⎜ Somebody write a testsuite for helloworld :-)
Re: Converting row elements into a arrays?
On Thu, Mar 2, 2023 at 3:47 PM Ron wrote:
> I'm used to the bog standard COUNT, AVG, MIN, MAX, SUM. It didn't
> occur to me that there would be others...

wait until you find out you can write your own:

CREATE OR REPLACE FUNCTION agg_leftagg(l TEXT, r anyelement) RETURNS text AS
$$
BEGIN
  RETURN CASE
    WHEN l IS NOT NULL THEN format('%s-%s', l, r)
    ELSE r::TEXT
  END;
END;
$$ LANGUAGE plpgsql;

CREATE AGGREGATE leftagg(anyelement) (SFUNC=agg_leftagg, STYPE=TEXT);

CREATE TEMP TABLE s AS SELECT generate_series(1,10) s;

postgres=# SELECT leftagg(s) FROM s GROUP BY random() > .5;
  leftagg
------------
 2-3-5-6-10
 1-4-7-8-9
(2 rows)

this is why I fell in love with postgres 20 years ago, and never looked back

merlin
Re: Dropping behavior for unique CONSTRAINTs
On 3/3/23 04:54, David Rowley wrote:
> If you have a look at
> https://www.postgresql.org/docs/15/sql-dropindex.html, check out the
> CONCURRENTLY option. That option allows an index to be dropped without
> blocking concurrent reads and writes to the table. It seems like just
> having a unique index without the constraint is likely your best bet
> if you can't afford to block any traffic for the brief moment it would
> take to drop the constraint.

That doc page says this about CONCURRENTLY:

"There are several caveats to be aware of when using this option. Only one index name can be specified, and the CASCADE option is not supported. (Thus, *an index that supports a UNIQUE or PRIMARY KEY constraint cannot be dropped this way.*)"

--
Born in Arizona, moved to Babylonia.
Re: 13.x, stream replication and locale(?) issues
On Wed, Mar 1, 2023 at 10:30 AM Thomas Munro wrote:
> On Wed, Mar 1, 2023 at 12:09 AM Eugene M. Zheganin wrote:
>> 3) how do I fix it? Should I take locale sources for ru_RU.utf8 on
>> Linux and compile it on FreeBSD - will it help?
>
> Out of curiosity (I'm not saying it's a good idea!), do you know if
> FreeBSD's localedef can compile glibc's collation definitions? In
> theory they are in a format standardised by POSIX... I suspect there
> may be extensions and quirks...

Another thought: if you upgrade to 15, you could use ICU as the default collation provider, and then make sure you have the same major version of ICU on your Linux and FreeBSD systems (which probably means compiling from source on FreeBSD, as FreeBSD tends to have a newish ICU in packages, while popular stable Linux distributions tend to have a fairly old one).

As far as I know, Linux and FreeBSD systems on the amd64 arch should match in every other important detail already (they both follow the System V ABI, so there shouldn't be any layout differences in eg structs that are sent over the wire, AFAIK).

For what it's worth, for the future we are trying to figure out how to support multi-version ICU so that you could explicitly set the provider to ICU v72 to get that sort of thing working across OS versions and even "close enough" cousins like your case, without having to compile anything from source, but unfortunately we didn't manage to get it ready in time for 16.
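For reference, the PostgreSQL 15 knobs Thomas mentions look roughly like this. The data directory path, database name, and locale are illustrative placeholders:

```shell
# new cluster with ICU as the default collation provider
initdb -D /path/to/data --locale-provider=icu --icu-locale=ru-RU

# or per database, on an existing cluster
psql -c "CREATE DATABASE app TEMPLATE template0
         LOCALE_PROVIDER icu ICU_LOCALE 'ru-RU';"
```

Either way, the cross-OS consistency still depends on both machines running the same ICU major version, as described above.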
Re: Dropping behavior for unique CONSTRAINTs
On Sat, 4 Mar 2023 at 10:55, Ron wrote:
> That doc page says this about CONCURRENTLY:
> "There are several caveats to be aware of when using this option. Only
> one index name can be specified, and the CASCADE option is not
> supported. (Thus, an index that supports a UNIQUE or PRIMARY KEY
> constraint cannot be dropped this way.)"

I'm not sure which one of these you think applies to the recommendation I mentioned, or if you were just generally highlighting the limitations of DROP INDEX CONCURRENTLY.

David
Re: Converting row elements into a arrays?
On Fri, 3 Mar 2023 at 21:51, Merlin Moncure wrote:
> wait until you find out you can write your own:
>
> CREATE AGGREGATE leftagg(anyelement) (SFUNC=agg_leftagg, STYPE=TEXT);
>
> SELECT leftagg(s) FROM s GROUP BY random() > .5;

This can work, but it can be slow for large data. A faster (although not exactly equivalent :-)) way:

(2023-03-04 06:22:56) postgres=# CREATE TEMP TABLE s AS SELECT generate_series(1,10) s;
SELECT 10

(2023-03-04 06:22:57) postgres=# SELECT array_agg(s) FROM s GROUP BY random() > .5;
  array_agg
--------------
 {3,6,8,9,10}
 {1,2,4,5,7}
(2 rows)

(2023-03-04 06:23:21) postgres=# SELECT array_to_string(array_agg(s), '-') FROM s GROUP BY random() > .5;
 array_to_string
------------------
 1-2
 3-4-5-6-7-8-9-10
(2 rows)

Performance comparison on 1 million rows (but with asserts enabled); I modified the query to return 100 groups, because building extra-long strings is slow:

(2023-03-04 06:33:04) postgres=# EXPLAIN ANALYZE SELECT array_to_string(array_agg(s), '-') FROM s GROUP BY (random() * 100)::int;
 GroupAggregate  (cost=135257.34..165257.34 rows=100 width=36) (actual time=715.400..1128.007 rows=101 loops=1)
   Group Key: (((random() * '100'::double precision))::integer)
   ->  Sort  (cost=135257.34..137757.34 rows=100 width=8) (actual time=712.689..853.335 rows=100 loops=1)
         Sort Key: (((random() * '100'::double precision))::integer)
         Sort Method: external merge  Disk: 17664kB
         ->  Seq Scan on s  (cost=0.00..21925.00 rows=100 width=8) (actual time=6.135..192.553 rows=100 loops=1)
 Planning Time: 0.082 ms
 JIT:
   Functions: 7
   Options: Inlining false, Optimization false, Expressions true, Deforming true
   Timing: Generation 0.484 ms, Inlining 0.000 ms, Optimization 0.276 ms, Emission 5.877 ms, Total 6.637 ms
 Execution Time: 1133.816 ms
(12 rows)

array_to_string(array_agg()) .. 1 sec
agg_leftagg .. 27 sec

Using a final function can be faster:

create or replace function agg_leftagg_final(anycompatiblearray) returns text as
$$
begin
  return array_to_string($1, '-');
end;
$$ language plpgsql;

CREATE AGGREGATE leftagg2(anycompatible) (SFUNC=array_append,
  STYPE=anycompatiblearray, INITCOND='{}', FINALFUNC=agg_leftagg_final);

(2023-03-04 06:57:18) postgres=# EXPLAIN ANALYZE SELECT leftagg2(s) FROM s GROUP BY (random() * 100)::int;
 HashAggregate  (cost=78175.00..353487.50 rows=100 width=36) (actual time=436.202..540.029 rows=101 loops=1)
   Group Key: ((random() * '100'::double precision))::integer
   Planned Partitions: 256  Batches: 1  Memory Usage: 11930kB
   ->  Seq Scan on s  (cost=0.00..21925.00 rows=100 width=8) (actual time=5.710..174.016 rows=100 loops=1)
 Planning Time: 0.231 ms
 JIT:
   Functions: 7
   Options: Inlining false, Optimization false, Expressions true, Deforming true
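Worth noting: for this particular joined-string shape there is also a built-in aggregate that avoids both the custom PL/pgSQL transition function and the array round-trip. Written against the temp table s from upthread:

```sql
-- string_agg is implemented in C and supports ORDER BY inside the
-- aggregate call, so no final function is needed
SELECT string_agg(s::text, '-' ORDER BY s)
FROM s
GROUP BY random() > .5;
```

Like the other variants, the grouping here is random, so the group contents differ run to run; only the ordering within each group is deterministic.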