Performance degradation with CTEs, switching from PG 11 to PG 15

2023-11-22 Thread Jean-Christophe Boggio

Hello,

I just switched from PG11 to PG15 on our production server (Version is 
15.5). Just made a vacuum full analyze on the DB.


I have a relatively simple query that used to be fast and is now taking 
very long (from less than 10 seconds to 3mn+)


If I remove a WHERE condition changes the calculation time dramatically. 
The result is not exactly the same but that extra filtering seems very 
long...


Also, adding "materialized" to both "withcwrack" and "withcwrack0" CTEs 
gets the result in acceptable timings (a few seconds). The problem with 
this is that we have some clients with older versions of PG and I guess 
blindly adding the "materialized" keyword will cause errors.


Is there anything I can do to prevent that kind of behaviour ? I'm a 
little afraid to have to review all the queries in my softwares to keep 
good performances with PG 15 ? Maybe there's a way to configure the 
server so that CTEs are materialized by default ? Not ideal but I could 
slowly refine queries to enforce "not materialized" and benefit from the 
improvement without affecting all our users.


Thanks for your inputs.

JC


Here is the query:

explain (analyze,buffers)
WITH myselect AS (
 SELECT DISTINCT og.idoeu
 FROM oegroupes og
 WHERE (og.idgroupe = 4470)
)
   , withcwrack0 AS (
    SELECT idoeu, idthirdparty, ackcode
    FROM (
 SELECT imd.idoeu,
    imd.idthirdparty,
    imd.ackcode,
    RANK() OVER (PARTITION BY imd.idoeu, 
imd.idthirdparty ORDER BY imd.idimport DESC) AS rang

 FROM importdetails imd
 WHERE imd.ackcode NOT IN ('RA', '')
    ) x
    WHERE x.rang = 1
)
   , withcwrack AS (
   SELECT idoeu,
  STRING_AGG(DISTINCT tp.nom, ', ' ORDER BY 
tp.nom) FILTER (WHERE ackcode IN ('AS', 'AC', 'NP', 'DU')) AS cwrackok,
  STRING_AGG(DISTINCT tp.nom, ', ' ORDER BY 
tp.nom) FILTER (WHERE ackcode IN ('CO', 'RJ', 'RC')) AS cwracknotok

   FROM withcwrack0
   JOIN thirdparty tp USING (idthirdparty)
   GROUP BY idoeu
)
SELECT DISTINCT og.idoegroupe,
    og.idoeu,
    o.titrelong,
    o.created,
    o.datedepotsacem,
    s.nom AS companyname,
    na.aggname AS actorsnames,
    COALESCE(TRIM(o.repnom1), '') || COALESCE(' / ' || 
TRIM(o.repnom2), '') ||
    COALESCE(' / ' || TRIM(o.repnom3), '') AS 
actorsnamesinfosrepart,

    o.cocv AS favcode,
    o.contrattiredufilm,
    o.interprete,
    o.codecocv,
    o.idsociete,
    o.idimport,
    o.donotexport,
    o.observations,
    withcwrack.cwracknotok AS cwracknotok,
    withcwrack.cwrackok AS cwrackok,
    oghl.idgroupe IS NOT NULL AS list_highlight1
FROM oegroupes og
JOIN myselect ON myselect.idoeu = og.idoeu
JOIN oeu o ON o.idoeu = og.idoeu
LEFT JOIN societes s ON s.idsociete = o.idsociete
LEFT JOIN nomsad na ON na.idoeu = o.idoeu
LEFT JOIN withcwrack ON withcwrack.idoeu = o.idoeu
LEFT JOIN oegroupes oghl ON o.idoeu = oghl.idoeu AND oghl.idgroupe = NULL

-- Commenting out the following line makes the query fast :

    WHERE (og.idgroupe=4470)





Fast version (without the final where) :

Unique  (cost=.76..8906.76 rows=360 width=273) (actual 
time=343.424..345.687 rows=3004 loops=1)

  Buffers: shared hit=26366
  ->  Sort  (cost=.76..8889.66 rows=360 width=273) (actual 
time=343.422..343.742 rows=3004 loops=1)
    Sort Key: og.idoegroupe, og.idoeu, o.titrelong, o.created, 
o.datedepotsacem, s.nom, na.aggname, (((COALESCE(TRIM(BOTH FROM 
o.repnom1), ''::text) || COALESCE((' / '::text || TRIM(BOTH FROM 
o.repnom2)), ''::text)) || COALESCE((' / '::text || TRIM(BOTH FROM 
o.repnom3)), ''::text))), o.cocv, o.contrattiredufilm, o.interprete, 
(codecocv(o.*)), o.idsociete, o.idimport, o.donotexport, o.observations, 
(string_agg(DISTINCT (tp.nom)::text, ', '::text ORDER BY (tp.nom)::text) 
FILTER (WHERE ((x.ackcode)::text = ANY ('{CO,RJ,RC}'::text[], 
(string_agg(DISTINCT (tp.nom)::text, ', '::text ORDER BY (tp.nom)::text) 
FILTER (WHERE ((x.ackcode)::text = ANY ('{AS,AC,NP,DU}'::text[], 
((idgroupe IS NOT NULL))

    Sort Method: quicksort  Memory: 524kB
    Buffers: shared hit=26366
    ->  Nested Loop Left Join  (cost=6811.39..8873.48 rows=360 
width=273) (actual time=291.636..340.755 rows=3004 loops=1)

  Join Filter: false
  Buffers: shared hit=26355
  ->  Nested Loop  (cost=6811.39..8773.58 rows=360 
width=2964) (actual time=290.747..301.506 rows=3004 loops=1)

    Join Filter: (og_1.idoe

Re: Performance degradation with CTEs, switching from PG 11 to PG 15

2023-11-22 Thread John Naylor
On Wed, Nov 22, 2023 at 6:39 PM Jean-Christophe Boggio
 wrote:
>
> Hello,
>
> I just switched from PG11 to PG15 on our production server (Version is
> 15.5). Just made a vacuum full analyze on the DB.

Note that "vacuum full" is not recommended practice in most
situations. Among the downsides, it removes the visibility map, which
is necessary to allow index-only scans. Plain vacuum should always be
used except for certain dire situations. Before proceeding further,
please perform a plain vacuum on the DB. After that, check if there
are still problems with your queries.

> Also, adding "materialized" to both "withcwrack" and "withcwrack0" CTEs
> gets the result in acceptable timings (a few seconds). The problem with
> this is that we have some clients with older versions of PG and I guess
> blindly adding the "materialized" keyword will cause errors.

Yes, meaning 11 and earlier don't recognize that keyword keyword.

> Is there anything I can do to prevent that kind of behaviour ? I'm a
> little afraid to have to review all the queries in my softwares to keep
> good performances with PG 15 ? Maybe there's a way to configure the
> server so that CTEs are materialized by default ?

There is no such a way. It would be surely be useful for some users to
have a way to slowly migrate query plans to new planner versions, but
that's not how it works today.




Re: Performance degradation with CTEs, switching from PG 11 to PG 15

2023-11-22 Thread Jean-Christophe Boggio

John,

Le 22/11/2023 à 14:30, John Naylor a écrit :
Note that "vacuum full" is not recommended practice in most  > situations. Among the downsides, it removes the visibility map, > 
which is necessary to allow index-only scans. Plain vacuum should > 
always be used except for certain dire situations. Before proceeding > 
further, please perform a plain vacuum on the DB. After that, check > if 
there are still problems with your queries.
Did both VACUUM ANALYZE and VACUUM (which one did you recommend 
exactly?) and things go much faster now, thanks a lot. I will also check 
why autovacuum did not do its job.


Is there anything I can do to prevent that kind of behaviour ? I'm  >> a little afraid to have to review all the queries in my softwares 
>> to keep good performances with PG 15 ? Maybe there's a way to >> 
configure the server so that CTEs are materialized by default ? > > 
There is no such a way. It would be surely be useful for some users > to 
have a way to slowly migrate query plans to new planner versions, > but 
that's not how it works today.
Thanks for your input so I know I did not miss a parameter. And yes, 
that would be handy.


Best regards,






Re: Performance degradation with CTEs, switching from PG 11 to PG 15

2023-11-22 Thread Andreas Kretschmer




Am 22.11.23 um 12:38 schrieb Jean-Christophe Boggio:



Also, adding "materialized" to both "withcwrack" and "withcwrack0" 
CTEs gets the result in acceptable timings (a few seconds). The 
problem with this is that we have some clients with older versions of 
PG and I guess blindly adding the "materialized" keyword will cause 
errors.




yeah, prior to 11 CTEs are a optimizer barrier. You can try to rewrite 
the queries to not using CTEs - or upgrade. If i were you i would upgrade.



Regards, Andreas

--
Andreas Kretschmer - currently still (garden leave)
Technical Account Manager (TAM)
www.enterprisedb.com





Re: Performance degradation with CTEs, switching from PG 11 to PG 15

2023-11-22 Thread Jean-Christophe Boggio

Andreas,

Le 22/11/2023 à 15:25, Andreas Kretschmer a écrit :
Am 22.11.23 um 12:38 schrieb Jean-Christophe Boggio:  >> Also, adding "materialized" to both "withcwrack" and "withcwrack0" 
>> CTEs gets the result in acceptable timings (a few seconds). The >> 
problem with this is that we have some clients with older versions >> of 
PG and I guess blindly adding the "materialized" keyword will >> cause 
errors. > yeah, prior to 11 CTEs are a optimizer barrier. You can try to 
> rewrite the queries to not using CTEs - or upgrade. If i were you i > 
would upgrade.
I did upgrade :-) But we have many users for which we don't decide on 
when they do upgrade so we have to keep compatibility with most versions 
of PG and in that particular case (non-existence of the materialized 
keyword for PG 11 and before) it is a real problem.


Best regards,

JC





Re: Performance degradation with CTEs, switching from PG 11 to PG 15

2023-11-22 Thread Tom Lane
Jean-Christophe Boggio  writes:
> I did upgrade :-) But we have many users for which we don't decide on 
> when they do upgrade so we have to keep compatibility with most versions 
> of PG and in that particular case (non-existence of the materialized 
> keyword for PG 11 and before) it is a real problem.

PG 11 is out of support as of earlier this month, so your users really
need to be prioritizing getting onto more modern versions.

regards, tom lane