Re: Why could different data in a table be processed with different performance?

2018-09-27 Thread Fabio Pardi



On 09/26/2018 07:15 PM, Vladimir Ryabtsev wrote:
>> Since you have a very big toast table, given you are using spinning
>> disks, I think that increasing the block size will bring benefits.
> But will it worsen caching? I will have fewer slots in cache. Also, will
> it affect the required storage space?


I think in your case it will not worsen the cache. You will have fewer
slots in the cache, but the total cache capacity will be unchanged (half
the number of blocks at double the size). It could affect storage space
for the smaller records. Much depends on which block size you choose and
how your data is actually distributed across the ranges you mentioned
(e.g. records in the 10-20 KB range might cluster closer to 10 KB or
closer to 20 KB).

Imagine you request a record of 24 KB while using 8 KB blocks. That
results in three different blocks being looked up, requested, and
returned. Those three blocks might be scattered on disk, possibly
resulting in three separate seeks. Having it all in one block avoids this
problem.
The downside is that if you need to store 8 KB of data, you will allocate 24 KB.
You say you do not do updates, so it might also be the case that when you
write data all at once (24 KB in one go) it lands together in a contiguous
stripe; in that case a block size change will bring nothing.
This is very much data and usage driven. Changing the block size is
painful, because it is fixed when the server binaries are built, so it
cannot be changed on an existing cluster.
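
For reference, the block size is a build-time option (configure
--with-blocksize, 8 kB by default), so changing it means a recompile plus a
fresh initdb. You can check what a running cluster uses from any session:

  SHOW block_size;
  -- block_size
  -- ------------
  -- 8192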

Similarly, if your RAID controller uses, for instance, 128 KB blocks, each
time you read one 8 KB block it will return a whole 128 KB chunk, which is
quite a waste of resources.

If your 'slow' range is fragmented here and there on the disk, a lack of
proper alignment between Postgres blocks, the filesystem, and the RAID
might worsen the problem by orders of magnitude. This is especially true
on spinning disks, where the seek time is noticeable.

Note that setting a very small block size has the opposite effect: you
might hit the IOPS limit of your hardware and create a bottleneck (been
there while benchmarking some new hardware).

But before going through all this, I would first try to reload the data
with dump+restore into a new machine, and see how it behaves.

Hope it helps.

regards,

fabio pardi

> 
>>> consecutive runs with SAME parameters do NOT hit the disk, only the
>>> first one does; subsequent ones read only from the buffer cache.
>> I'm a bit confused... every query you pasted contains 'read':
>>        Buffers: shared hit=50 read=2378
>> and 'read' means you are reading from disk (or OS cache). Or not?
> Yes, sorry, it was just my misunderstanding of what "consecutive" means.
> To make it clear: I iterate over all data in the table with one query and
> different parameters on each iteration (e.g. shifting both borders by
> +5000); in this case I get disk reads on each query run (many more reads
> in the "slow" range). But if I request data from an area queried
> previously, it reads from cache and does not hit the disk (both ranges).
> E.g. iterating over 1M records with an empty cache takes ~11 minutes in
> the "fast" range and ~1 hour in the "slow" range, while the second time
> it takes only ~2 minutes for both ranges (if I don't do drop_caches).
> 
> Regards,
> Vlad
> 

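A minimal sketch of the access pattern Vlad describes above, with
hypothetical table and column names:

  -- first pass, cold cache: each new window is read from disk
  SELECT * FROM articles WHERE article_id BETWEEN 5000000 AND 5005000;
  SELECT * FROM articles WHERE article_id BETWEEN 5005000 AND 5010000;
  -- re-requesting an already-visited window is served from the buffer
  -- cache / OS cache instead and does not hit the disk
  SELECT * FROM articles WHERE article_id BETWEEN 5000000 AND 5005000;
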


SELECT statement returns in 10 seconds, but INSERT/CREATE TABLE AS with the same SELECT takes 7 minutes

2018-09-27 Thread Arjun Ranade
I have a strange performance situation that I cannot resolve with my usual
process.

I have a SELECT statement that completes in about 12 seconds for the full
result (~1100 rows).

If I create an empty table first, and then INSERT with the SELECT query, it
takes 6.5 minutes.

When I look at the EXPLAIN ANALYZE output, it seems that it's using a
drastically different query plan for the INSERT+SELECT than SELECT by
itself.

Here's the explain plan for the SELECT() by itself:
https://explain.depesz.com/s/8Qmr

Here's the explain plan for INSERT INTO x SELECT():
https://explain.depesz.com/s/qifT

I am running PostgreSQL 10 (PostgreSQL 10.4 on x86_64-pc-linux-gnu, compiled
by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-18), 64-bit).

shared_buffers = 4GB
effective_cache_size = 4GB
work_mem = 8GB
wal_buffers = -1
max_wal_size = 2GB
wal_level = replica
archiving on
Total RAM on machine: 252GB

This database gets a VACUUM FULL, ANALYZE once a week.  Autovacuum is ON
with PG 10 default settings.

The machine has 12 Intel Xeon E5-2643 v3 CPUs @ 3.40GHz, and 15k RPM disks
for Postgres.  I have tested write speed to all filesystems and speeds are
as expected.  pg_wal is on a separate disk resource; however, those disks
are also 15k RPM and set up the same way as the Postgres data disks.

The queries are sensitive so I had to obfuscate them in the explain plans.
I am reluctant to provide full metadata for all the objects involved, but
will if it comes to that.  I first want to understand why the query plan
would be so different for a SELECT vs. an INSERT INTO x SELECT.  I also
tried CREATE TABLE x AS SELECT() but it also takes 6+ minutes.

Is there any general advice on why a SELECT can finish in 10 seconds but
CREATE TABLE AS SELECT() runs in 7 minutes?

Any advice would be much appreciated.

Thanks,
Arjun Ranade


Re: SELECT statement returns in 10 seconds, but INSERT/CREATE TABLE AS with the same SELECT takes 7 minutes

2018-09-27 Thread Tom Lane
Arjun Ranade  writes:
> I have a strange performance situation that I cannot resolve with my usual
> process.
> I have a SELECT statement that completes in about 12 seconds for the full
> result (~1100 rows).
> If I create an empty table first, and then INSERT with the SELECT query, it
> takes 6.5 minutes.

> When I look at the EXPLAIN ANALYZE output, it seems that it's using a
> drastically different query plan for the INSERT+SELECT than SELECT by
> itself.

The reason for the plan shape difference is probably that the bare SELECT
is allowed to use parallelism while INSERT/SELECT isn't.  I'm not sure
to what extent we could relax that without creating semantic gotchas.

However, your real problem with either query is that the planner's
rowcount estimates are off by several orders of magnitude.  If you could
improve that, you'd likely get better plan choices in both cases.

I also notice that this seems to be a 14-way join, which means you're
probably getting an artificially poor plan as a result of 
from_collapse_limit and/or join_collapse_limit constraining the planner's
search space.  Maybe raising those limits would help, although I'm not
sure how much it'd help if the rowcount estimates aren't improved.

Since you haven't told us much of anything about the actual query or the
data, it's hard to offer concrete advice beyond that.

regards, tom lane



Re: SELECT statement returns in 10 seconds, but INSERT/CREATE TABLE AS with the same SELECT takes 7 minutes

2018-09-27 Thread Arjun Ranade
Hi Tom,

Thank you for your suggestions.  I tried increasing from_collapse_limit and
join_collapse_limit to 16 in a specific session, and that significantly
improved my query performance (it takes < 2s now).  My instinct is to
increase this globally, but I'm sure there are some drawbacks, so I will
need to read more about it.
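
For the record, the session-level change was along these lines (16 gives
just enough headroom for the 14 relations in the join):

  SET from_collapse_limit = 16;
  SET join_collapse_limit = 16;
  -- run the INSERT ... SELECT here, then optionally:
  RESET from_collapse_limit;
  RESET join_collapse_limit;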

Your point about parallelism is interesting, I hadn't considered that.

Even after working with Postgres for years, there really is a lot to learn
about query optimization that is new to me.  I'd never heard of these
parameters before your email, since almost every performance issue I've had
thus far was resolved by creating an index or by smarter query re-writing.

I'm reading the documentation regarding these specific parameters, but it's
written as a reference page as opposed to an explanation of query planning
and optimization.  I wonder if there is a class or book that covers these
details better.

Anyway, thank you so much for pointing me in the right direction.

Best,
Arjun

On Thu, Sep 27, 2018 at 1:21 PM Tom Lane  wrote:

> Arjun Ranade  writes:
> > I have a strange performance situation that I cannot resolve with my
> usual
> > process.
> > I have a SELECT statement that completes in about 12 seconds for the full
> > result (~1100 rows).
> > If I create an empty table first, and then INSERT with the SELECT query,
> it
> > takes 6.5 minutes.
>
> > When I look at the EXPLAIN ANALYZE output, it seems that it's using a
> > drastically different query plan for the INSERT+SELECT than SELECT by
> > itself.
>
> The reason for the plan shape difference is probably that the bare SELECT
> is allowed to use parallelism while INSERT/SELECT isn't.  I'm not sure
> to what extent we could relax that without creating semantic gotchas.
>
> However, your real problem with either query is that the planner's
> rowcount estimates are off by several orders of magnitude.  If you could
> improve that, you'd likely get better plan choices in both cases.
>
> I also notice that this seems to be a 14-way join, which means you're
> probably getting an artificially poor plan as a result of
> from_collapse_limit and/or join_collapse_limit constraining the planner's
> search space.  Maybe raising those limits would help, although I'm not
> sure how much it'd help if the rowcount estimates aren't improved.
>
> Since you haven't told us much of anything about the actual query or the
> data, it's hard to offer concrete advice beyond that.
>
> regards, tom lane
>


Re: SELECT statement returns in 10 seconds, but INSERT/CREATE TABLE AS with the same SELECT takes 7 minutes

2018-09-27 Thread Justin Pryzby
On Thu, Sep 27, 2018 at 01:08:05PM -0400, Arjun Ranade wrote:
> When I look at the EXPLAIN ANALYZE output, it seems that it's using a
> drastically different query plan for the INSERT+SELECT than SELECT by
> itself.

The fast, SELECT plan is using parallel query, which isn't available for 
INSERT+SELECT:

https://www.postgresql.org/docs/current/static/when-can-parallel-query-be-used.html
| Even when it is in general possible for parallel query plans to be
| generated, the planner will not generate them for a given query if any of
| the following are true:
| The query writes any data or locks any database rows.

Using parallel query in this case happens to mitigate the effects of the bad
plan.

I see Tom responded, and you got an improvement by raising the join collapse thresholds.

But I think you could perhaps get a better plan if the rowcount estimates
were fixed.  That's probably more important than anything else - changing
settings is only a workaround for bad estimates.

In the slow/INSERT plan, this join is returning 55000x more rows than expected
(not 55k more: 55k TIMES more).

7.  26,937.132  401,503.136 ↓ 55,483.7  332,902  1
Nested Loop (cost=1,516.620..42,244.240 rows=6 width=84)
            (actual time=311.021..401,503.136 rows=332,902 loops=1)
  Join Filter: (((papa_echo.oscar_bravo)::text = (five_hotel.tango_november)::text)
            AND ((papa_echo.lima_tango)::text = (five_hotel.lima_mike)::text)
            AND ((xray_juliet1.juliet)::text = (five_hotel.papa_victor)::text))
  Rows Removed by Join Filter: 351664882
  Buffers: shared hit=8570619 read=6

First question is whether all those conditions are independent, or whether
one of those conditions implies another, which would confuse the planner.
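
For example (a sketch only -- these are the obfuscated names from the plan,
standing in for the real tables), you could compare estimated vs. actual
rows for each condition in isolation:

  EXPLAIN ANALYZE
  SELECT count(*)
  FROM papa_echo pe
  JOIN five_hotel fh ON (pe.oscar_bravo)::text = (fh.tango_november)::text;
  -- then repeat for the other two conditions, and for pairs of them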

Justin



Re: SELECT statement returns in 10 seconds, but INSERT/CREATE TABLE AS with the same SELECT takes 7 minutes

2018-09-27 Thread Arjun Ranade
Yes, that join is concerning (quoted below).  The conditions all need to
be checked, so they are independent.

The query (with consistent obfuscation) is below:

select distinct
       a.sale_id
     , a.test_date
     , a.product_id as original_product_id
     , vw2.product_id
     , a.volume as volume
     , b.pair_rank
from not_sold_locations a
inner join vw_product vw2
     using (product_group_name, product_class_code, product_type_code, sale_end_date)
inner join product_mapping b
     on  a.product_group_name = b.left_product_group_name
     and a.product_node_name = b.left_product_node
     and a.product_type_code = b.left_product
     and vw2.product_node_name = b.right_product_node
     and vw2.product_group_name = b.right_product_group_name
     and vw2.product_type_code = b.right_product
inner join mapping_ref i
     on  vw2.product_group_name || '.' || vw2.product_node_name = i.product_node_name
     and vw2.product_class_code = i.product_class_code
     and vw2.product_type_code = i.product_type_code
     and vw2.sale_end_date between i.first_product_date and i.last_product_date;

not_sold_locations (a) has 836 rows.

vw_product (vw2) has 785k rows; it is a view that joins 11 tables together
to give a consolidated view of all products, sales locations, etc.

product_mapping (b) has 2520 rows.

mapping_ref (i) has 178 rows.



On Thu, Sep 27, 2018 at 2:52 PM Justin Pryzby  wrote:

> On Thu, Sep 27, 2018 at 01:08:05PM -0400, Arjun Ranade wrote:
> > When I look at the EXPLAIN ANALYZE output, it seems that it's using a
> > drastically different query plan for the INSERT+SELECT than SELECT by
> > itself.
>
> The fast, SELECT plan is using parallel query, which isn't available for
> INSERT+SELECT:
>
>
> https://www.postgresql.org/docs/current/static/when-can-parallel-query-be-used.html
> |Even when it is in general possible for parallel query plans to be
> generated, the planner will not generate them for a given query if any of
> the following are true:
> |The query writes any data or locks any database rows.
>
> Using parallel query in this case happens to mitigate the effects of the
> bad
> plan.
>
> I see Tom responded, and you got an improvement by changing join threshold.
>
> But I think you could perhaps get an better plan if the rowcount estimates
> were
> fixed.  That's more important than probably anything else - changing
> settings
> is only a workaround for bad estimates.
>
> In the slow/INSERT plan, this join is returning 55000x more rows than
> expected
> (not 55k more: 55k TIMES more).
>
> 7.  26,937.132  401,503.136 ↓ 55,483.7  332,902 1
>
> Nested Loop (cost=1,516.620..42,244.240 rows=6 width=84) (actual
> time=311.021..401,503.136 rows=332,902 loops=1)
> Join Filter: (((papa_echo.oscar_bravo)::text =
> (five_hotel.tango_november)::text) AND ((papa_echo.lima_tango)::text =
> (five_hotel.lima_mike)::text) AND ((xray_juliet1.juliet)::text =
> (five_hotel.papa_victor)::text))
> Rows Removed by Join Filter: 351664882
> Buffers: shared hit=8570619 read=6
>
> First question is if all those conditions are independent?  Or if one of
> those
> conditions also implies another, which is confusing the planner.
>
> Justin
>


Re: SELECT statement returns in 10 seconds, but INSERT/CREATE TABLE AS with the same SELECT takes 7 minutes

2018-09-27 Thread Justin Pryzby
On Thu, Sep 27, 2018 at 03:37:57PM -0400, Arjun Ranade wrote:
> Yes, that join is concerning (red text below).  The conditions all need to
> be checked so they are independent.

You can play with the join conditions to see which test is getting such a
bad estimate, or whether it's a combination of tests (as I suspected)
giving a bad estimate.

There's a good chance this one isn't doing very well:

> vw2.product_group_name || '.' || vw2.product_node_name = i.product_node_name

As a workaround/test, you could maybe add an expression index
ON ((vw2.product_group_name || '.' || vw2.product_node_name))

...and then ANALYZE.  Eventually, you'd want to consider splitting
i.product_node_name into separate columns.
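
A sketch of that suggestion (product_dim here is a hypothetical name for
whichever base table under the view actually supplies these columns):

  CREATE INDEX product_group_node_idx
      ON product_dim ((product_group_name || '.' || product_node_name));
  ANALYZE product_dim;
  -- ANALYZE then gathers statistics on the concatenated expression itself,
  -- which the planner can use when estimating the join condition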

Justin



Re: SELECT statement returns in 10 seconds, but INSERT/CREATE TABLE AS with the same SELECT takes 7 minutes

2018-09-27 Thread Arjun Ranade
"As a workaround/test, you could maybe add an expression index
ON( (vw2.product_group_name ||'.'|| vw2.product_node_name) )"

Unfortunately, vw2 is a view, but I had a similar thought.  I'm looking
into splitting i.product-node_name into separate columns though, thanks!


On Thu, Sep 27, 2018 at 3:33 PM Justin Pryzby  wrote:

> On Thu, Sep 27, 2018 at 03:37:57PM -0400, Arjun Ranade wrote:
> > Yes, that join is concerning (red text below).  The conditions all need
> to
> > be checked so they are independent.
>
> You can play with the join conditions to see which test is getting such a
> bad
> estimate, or if it's a combination of tests (as I suspected) giving a bad
> estimate.
>
> There's a good chance this one isn't doing very well:
>
> >vw2.product_group_name ||'.'|| vw2.product_node_name =
> i.product_node_name
>
> As a workaround/test, you could maybe add an expression index
> ON( (vw2.product_group_name ||'.'|| vw2.product_node_name) )
>
> ..and then ANALYZE.  Eventually, you'd want to consider splitting
> i.product_node_name into separate columns.
>
> Justin
>


Re: SELECT statement returns in 10 seconds, but INSERT/CREATE TABLE AS with the same SELECT takes 7 minutes

2018-09-27 Thread Vladimir Ryabtsev
> The reason for the plan shape difference is probably that the bare SELECT
> is allowed to use parallelism while INSERT/SELECT isn't.
If parallelism is used, shouldn't it be reported in the plan as something
like "Workers Planned: N"?

Vlad


Re: SELECT statement returns in 10 seconds, but INSERT/CREATE TABLE AS with the same SELECT takes 7 minutes

2018-09-27 Thread Tom Lane
Vladimir Ryabtsev  writes:
>> The reason for the plan shape difference is probably that the bare SELECT
>> is allowed to use parallelism while INSERT/SELECT isn't.

> In case parallelism is used, should it report in the plan as something like
> "workers planned: N"?

It did --- see the Gather node.
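
That is, a parallel plan is recognizable by a shape like this (illustrative
only, not taken from the plans in question):

  Gather  (cost=... rows=...)
    Workers Planned: 2
    Workers Launched: 2
    ->  Parallel Seq Scan on ...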

regards, tom lane



Re: SELECT statement returns in 10 seconds, but INSERT/CREATE TABLE AS with the same SELECT takes 7 minutes

2018-09-27 Thread Vladimir Ryabtsev
> It did --- see the Gather node.
But "workers launched: 1"...
To my opinion, such a dramatic difference cannot be explained with avoiding
parallelism, the query was just stuck in a very inefficient plan (even
though almost all source data is read from cache).

Additionally, I think the author can try CREATE STATISTICS on the bunch of
columns used in the join. The very low row estimate for this join may come
from multiplying selectivities for each column assuming they are independent.
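
Something like this, for instance (a sketch; extended statistics exist
since PG 10, and these are the obfuscated column names from the query):

  CREATE STATISTICS product_mapping_dep (dependencies)
      ON left_product_group_name, left_product_node, left_product
      FROM product_mapping;
  ANALYZE product_mapping;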

Vlad


Re: SELECT statement returns in 10 seconds, but INSERT/CREATE TABLE AS with the same SELECT takes 7 minutes

2018-09-27 Thread Justin Pryzby
On Thu, Sep 27, 2018 at 04:50:36PM -0700, Vladimir Ryabtsev wrote:
> Additionally, I think author can try CREATE STATISTICS on the bunch of
> columns used in join. Very low rows estimate for this join may come from
> multiplying selectivities for each column assuming they are independent.

MV statistics don't currently help for joins:
https://www.postgresql.org/message-id/flat/CAKJS1f-6B7KnDFrh6SFhYn-YbHYOXmDDAfd0XC%3DjJKZMCrfQyg%40mail.gmail.com#925e19951fabc9a480b804d661d83be8

Justin



Re: To keep indexes in memory, is a large enough effective_cache_size enough?

2018-09-27 Thread Sam R.
Hi!
> The double buffering itself does not slow anything down.

That is what I was suspecting.  Double buffering may not matter in our
case, because the whole server is meant for PostgreSQL only.
In our case, we can e.g. reserve almost "all memory" for PostgreSQL
(shared buffers etc.).

Please correct me if I am wrong.
BR Sam


On Tue, Sep 25, 2018 at 23:55, David Rowley wrote:

On Tue, 25 Sep 2018 at 18:36, Sam R.  wrote:
> Regarding double buffering: I do not know how much double buffering would 
> slow down operations.
> It could also be possible to turn off kernel page cache on our DB server, to 
> avoid double buffering. Although, we may still keep it in use.

I think you've misunderstood double buffering.  The double buffering
itself does not slow anything down. If the buffer is in shared buffers
already then it does not need to look any further for it. Double
buffering only becomes an issue when buffers existing 2 times in
memory causes other useful buffers to appear 0 times.

-- 
 David Rowley                  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
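
A practical way to check what actually sits in shared buffers (e.g. whether
the indexes you want cached really stay resident) is the pg_buffercache
extension -- a sketch:

  CREATE EXTENSION IF NOT EXISTS pg_buffercache;
  -- top 10 relations by number of cached 8 kB buffers
  SELECT c.relname, count(*) AS buffers
  FROM pg_buffercache b
  JOIN pg_class c ON b.relfilenode = pg_relation_filenode(c.oid)
  GROUP BY c.relname
  ORDER BY buffers DESC
  LIMIT 10;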