Re: Awkward Join between generate_series and long table

2023-11-09 Thread Philip Semanchuk



> On Nov 8, 2023, at 8:26 PM, Lincoln Swaine-Moore wrote:
> 
> Hi all--
> 
> I'm having a performance problem in 12.16 that I'm hoping someone can help 
> with.



> Thanks for any and all help and suggestions.


Hi Lincoln,
I haven't read your SQL carefully so I may be completely off base, but I wanted 
to share an experience I had with generate_series() that caused some planner 
headaches that may be affecting you too. 

I was using generate_series() in a CROSS JOIN with a large table. The call to 
generate_series() only emitted a small number of rows (4 to 24), but the planner 
estimated it would emit 1000 rows because that's Postgres' default in the 
absence of other info. (See 
https://www.postgresql.org/docs/current/sql-createfunction.html, "The default 
assumption is 1000 rows.") I see an estimate for 1000 rows in your EXPLAIN 
output too, so you're experiencing the same thing, although in your case the 
estimate of 1000 might be more accurate. The misestimation was causing 
significant performance problems for me.

My solution was to wrap generate_series() in a custom function that had a ROWS 
qualifier (documented at the same link as above) to better inform the planner. 
In my case I used ROWS 16 since that was relatively accurate -- a lot more 
accurate than 1000, anyway.

Then I found that my pure SQL custom function was getting inlined, which caused 
the information in the ROWS qualifier to get lost. :-) I rewrote it in 
PL/pgSQL to prevent the inlining, and that solution worked well for me. (See 
convo at 
https://www.postgresql.org/message-id/flat/76B16E5F-59D0-4C97-8DBA-4B3BB21E2009%40americanefficient.com)
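Roughly, the shape of that workaround looks like this (the function name, 
signature, and ROWS 16 value here are just illustrative, not my exact code):

```sql
-- Wrapper around generate_series() with an explicit row estimate.
-- Written in PL/pgSQL rather than plain SQL so the planner cannot
-- inline it and discard the ROWS hint.
CREATE FUNCTION my_generate_series(start_ts timestamptz,
                                   end_ts   timestamptz,
                                   step     interval)
RETURNS SETOF timestamptz
ROWS 16  -- tells the planner to expect ~16 rows instead of the default 1000
LANGUAGE plpgsql
AS $$
BEGIN
    RETURN QUERY SELECT generate_series(start_ts, end_ts, step);
END;
$$;
```

Then the query calls my_generate_series() wherever it previously called 
generate_series() directly.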

On another note, I have also seen unexpected performance gains from introducing 
LATERAL into a JOIN. My guess is that I got lucky, and that the use of LATERAL 
sent the planner down a better path. 
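For concreteness, the kind of rewrite I mean looks something like this (table 
and column names invented):

```sql
-- Making the row-by-row dependency explicit with CROSS JOIN LATERAL:
-- for each row of big_table, generate_series() is evaluated with that
-- row's bounds.
SELECT b.id, g.ts
FROM big_table AS b
CROSS JOIN LATERAL
    generate_series(b.start_ts, b.end_ts, interval '1 hour') AS g(ts);
```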

Hope this is at least a little helpful!

Good luck,
Philip



RE: [EXTERNAL] Performance down with JDBC 42

2023-11-09 Thread Abraham, Danny
Hi guys,
Thanks for the help.
I was able to recreate the problem on the same DB with psql only, no JDBC.

A plain run of a complicated query: 50 ms
A PREPARE and then EXECUTE of the same query: 2500 ms

The plans are different, as discussed above. The fast one is using Materialize 
and Memoize.
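For the archives, the way I compared them in psql was along these lines (the 
query itself is simplified/made up here):

```sql
-- Hypothetical stand-in for the real complicated query:
PREPARE q(int) AS
SELECT count(*) FROM orders WHERE customer_id = $1;

-- Compare this plan against EXPLAIN (ANALYZE) of the plain query:
EXPLAIN (ANALYZE) EXECUTE q(42);

-- On PostgreSQL 12+, one can also test whether a generic (non-parameter-
-- aware) cached plan is the culprit by forcing a custom plan:
SET plan_cache_mode = force_custom_plan;
EXPLAIN (ANALYZE) EXECUTE q(42);
```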

Thanks

Danny




Re: Awkward Join between generate_series and long table

2023-11-09 Thread Lincoln Swaine-Moore
> I see an estimate for 1000 rows in your EXPLAIN output too, so you're
> experiencing the same thing, although in your case the estimate of 1000
> might be more accurate. The misestimation was causing significant
> performance problems for me.
>
> My solution was to wrap generate_series() in a custom function that had
> a ROWS qualifier
That's interesting! I actually wasn't familiar with the ROWS feature at
all, so that is good knowledge to pocket.

In my case, I think the number of rows will vary quite a bit for different
time periods/resolutions (and 1000 might not be a bad estimate for some of
the workloads). I do wonder whether, if the planner had a sense of how big
the series result could be for longer periods/finer resolutions (a bit of
information I could trivially compute outside the query and encode into it
explicitly if need be), it might avoid or at least minimize the nested
loop, but I'm not sure how to communicate that information.

Anyway, thank you for sharing! Very helpful to hear what other people have
dealt with in similar situations.