Re: Planning performance problem (67626.278ms)

2021-06-20 Thread Manuel Weitzman
Hello everyone,

> Apparently, the planner isn't reusing the data boundaries across alternative
> plans. It would be nicer if the planner remembered each column boundaries
> for later reuse (within the same planner execution).

I've written a very naive (and crappy) patch to show how adding
memoization to get_actual_variable_range() could help the planner in
scenarios with a large number of joins.

For the previous example,

> explain (analyze, buffers)
> select * from a
> join b b1 on (b1.a = a.a)
> join b b2 on (b2.a = a.a)
> where b1.a in (1,100,1,100,101);

each time you add a join clause the planner has to read an extra ~55K
buffers and gets about 200 ms slower.

1 join
 Planning:
   Buffers: shared hit=9 read=27329
 Planning Time: 101.745 ms
 Execution Time: 0.082 ms

2 joins
 Planning:
   Buffers: shared hit=42 read=81988
 Planning Time: 303.237 ms
 Execution Time: 0.102 ms

3 joins
 Planning:
   Buffers: shared hit=94 read=136660
 Planning Time: 508.947 ms
 Execution Time: 0.155 ms

4 joins
 Planning:
   Buffers: shared hit=188 read=191322
 Planning Time: 710.981 ms
 Execution Time: 0.168 ms


After adding memoization the cost in buffers remains constant and the
latency deteriorates only marginally with each join (as expected).

1 join
 Planning:
   Buffers: shared hit=10 read=27328
 Planning Time: 97.889 ms
 Execution Time: 0.066 ms

2 joins
 Planning:
   Buffers: shared hit=7 read=27331
 Planning Time: 100.589 ms
 Execution Time: 0.111 ms

3 joins
 Planning:
   Buffers: shared hit=9 read=27329
 Planning Time: 105.669 ms
 Execution Time: 0.134 ms

4 joins
 Planning:
   Buffers: shared hit=132 read=27370
 Planning Time: 155.716 ms
 Execution Time: 0.219 ms


I'd be happy to improve this patch into something better, though I'd
like suggestions on how to do it.
My idea is to create a local "memoization" struct instance within
standard_planner(). That would require passing a pointer down until it
reaches get_actual_variable_range(), which I think would be quite ugly
if done just to improve the planner for this scenario.
Is there any better mechanism I could reuse from other modules? (utils
or cache, for example)


Regards,
Manuel



actual_variable_range_memorization.diff
Description: Binary data


Re: Planning performance problem (67626.278ms)

2021-06-20 Thread Tom Lane
Manuel Weitzman  writes:
> I've written a very naive (and crappy) patch to show how adding
> memoization to get_actual_variable_range() could help the planner in
> scenarios with a large number of joins.

So ... the reason why there's not caching of get_actual_variable_range
results already is that I'd supposed it wouldn't be necessary given
the caching of selectivity estimates that happens at the RestrictInfo
level.  I don't have any objection in principle to adding another
caching layer if that one's not working well enough, but I think it'd
be wise to first understand why it's needed.

regards, tom lane




Re: Planning performance problem (67626.278ms)

2021-06-20 Thread Ranier Vilela
On Sun, Jun 20, 2021 at 14:50, Manuel Weitzman <
[email protected]> wrote:

> Hello everyone,
>
> I've written a very naive (and crappy) patch to show how adding
> memoization to get_actual_variable_range() could help the planner in
> scenarios with a large number of joins.
> [...]
> Is there any better mechanism I could reuse from other modules? (utils
> or cache, for example)
>
Without going into the merits of whether this cache will be adopted or not,
I have some comments about the code.

1. Prefer .patch to .diff; it makes it easier for browsers such as
Firefox to read and display the content automatically.
2. New struct?
Oid is an unsigned int, so it is smaller than int64.
A better layout orders the fields from largest to smallest:
+struct ActualVariableRangeCache {
+ int64 min_value; /* 8 bytes */
+ int64 max_value; /* 8 bytes */
+ Oid indexoid; /* 4 bytes */
+ bool has_min; /* 1 byte */
+ bool has_max; /* 1 byte */
+};
It takes up less space, because the compiler inserts no padding between
the fields.

3. Avoid use of type *long*; its size differs across 64-bit platforms:
on 64-bit Windows (LLP64) long is 4 bytes, while on 64-bit Linux (LP64)
it is 8 bytes.

4. Avoid C99-style declarations such as
for (unsigned long i = 0; ...)
Prefer:
   size_t i;
   for (i = 0; ...)
This helps with backpatching to branches that still target C89.

regards,
Ranier Vilela


Re: Estimating wal_keep_size

2021-06-20 Thread Dean Gibson (DB Administrator)

On 2021-06-16 17:36, Dean Gibson (DB Administrator) wrote:

Is this reasonable thinking?

I'd think that one would want a *wal_keep_size* to cover the pending 
updates while the standby server might be unavailable, however long 
one might anticipate that would be.


In my case, I get a complete replacement (in the form of "|"-delimited
ASCII files) of one of the SCHEMAs every Sunday.  The size of that ASCII
data is about 2GB, so I'm thinking of doubling that to 4GB (256 WAL
files) to protect me in case the standby is unavailable during the
update.  Note that a complete loss of both servers would not be
catastrophic (I have backups); it would just be annoying.




In the absence of any clear guidance, I temporarily set wal_keep_size to
16GB and waited for the Sunday update.  That update today created just
over 6GB of WAL files, so I've set wal_keep_size to 8GB (512 WAL files).


Oh, and wal_keep_size is NOT an upper limit restricting further WAL
files; it's more like a minimum.
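
For what it's worth, the WAL volume of such a load can be measured
directly rather than estimated from the input size, by recording the WAL
insert position before and after.  A sketch (the '0/0' placeholder
stands for the LSN captured before the load):

```sql
-- Before the weekly load: record the current WAL insert position.
SELECT pg_current_wal_lsn() AS lsn_before;

-- ... run the load ...

-- After the load: the WAL generated is the distance between the two
-- positions.  Replace '0/0' with the recorded lsn_before value.
SELECT pg_size_pretty(
           pg_wal_lsn_diff(pg_current_wal_lsn(), '0/0'::pg_lsn)
       ) AS wal_generated;
```

These functions exist under these names since PostgreSQL 10; older
releases spell them pg_current_xlog_location() and
pg_xlog_location_diff().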


Re: Planning performance problem (67626.278ms)

2021-06-20 Thread Tom Lane
Ranier Vilela  writes:
> 3. Avoid use of type *long*; its size differs across 64-bit platforms:
> on 64-bit Windows (LLP64) long is 4 bytes, while on 64-bit Linux (LP64)
> it is 8 bytes.

Agreed.

> 4. Avoid C99-style declarations such as
> for (unsigned long i = 0; ...)
> Prefer:
>    size_t i;
>    for (i = 0; ...)
> This helps with backpatching to branches that still target C89.

It seems unlikely that we'd consider back-patching this into pre-C99
branches, so I see no reason not to use C99 loop style.  (But do
keep in mind that we avoid most other C99-isms, such as intermixed
decls and code.)

regards, tom lane