Re: Why could different data in a table be processed with different performance?
On 09/26/2018 07:15 PM, Vladimir Ryabtsev wrote:
>> Since you have a very big toast table, given you are using spinning
>> disks, I think that increasing the block size will bring benefits.
> But will it worsen caching? I will have fewer slots in cache. Also,
> will it affect required storage space?

I think in your case it will not worsen the cache. You will have fewer
slots in the cache, but the total available cache will indeed be
unchanged (half the blocks of double the size).

It could affect storage space for the smaller records, since a record
always occupies at least one block. Much depends on which block size you
choose and on how your data is actually distributed within the ranges
you mentioned (e.g., the 10-20 KB range might lean toward the 10 KB side
or toward the 20 KB side).

Imagine you request a 24 KB record while using 8 KB blocks. That results
in 3 different blocks being looked up, requested, and returned. Those 3
blocks might be scattered across the disk, possibly resulting in 3
separate seeks. Having it all in one block avoids this problem. The con
is that if you need to store only 8 KB of data, you will still allocate
24 KB.

You say you do not do updates, so it might also be the case that when
you write the data all at once (24 KB in one go) it ends up in one
contiguous stripe anyway, in which case a block size change would bring
nothing. This is very much data and usage driven.

Changing the block size is a painful thing, because IIRC you can only do
it when PostgreSQL itself is built, not on an existing cluster.

Similarly, if your RAID controller uses, for instance, 128 KB blocks,
then each time you read one 8 KB block it returns a whole 128 KB chunk,
which is quite a waste of resources. If your 'slow' range is fragmented
here and there on the disk, poor alignment between Postgres blocks,
filesystem, and RAID can worsen the problem by orders of magnitude. This
is very true on spinning disks, where the seek time is noticeable.

Note that a very small block size has the opposite effect: you might hit
the IOPS limit of your hardware and create a bottleneck. (Been there
while benchmarking some new hardware.)

But before going through all this, I would first try to reload the data
with dump+restore onto a new machine, and see how it behaves.

Hope it helps.

regards,

fabio pardi

>>> consecutive runs with SAME parameters do NOT hit the disk, only the
>>> first one does, consequent ones read only from buffer cache.
>> I'm a bit confused.. every query you pasted contains 'read':
>> Buffers: shared hit=50 read=2378
>> and 'read' means you are reading from disk (or OS cache). Or not?
> Yes, sorry, it was just my misunderstanding of what is "consecutive".
> To make it clear: I iterate over all data in the table with one request
> and different parameters on each iteration (e.g. +5000 on both
> borders); in this case I get disk reads on each query run (many more
> reads in the "slow" range). But if I request data from an area queried
> previously, it reads from cache and does not hit the disk (both
> ranges). E.g. iterating over 1M records with an empty cache takes ~11
> minutes in the "fast" range and ~1 hour in the "slow" range, while the
> second time around it takes only ~2 minutes for both ranges (if I don't
> do drop_caches).
>
> Regards,
> Vlad
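As a point of reference for the block size discussion above: a running
server reports its block size via a simple query, while changing it
requires rebuilding PostgreSQL (--with-blocksize is the compile-time
configure option; there is no initdb- or run-time knob for it):

    SHOW block_size;   -- returns 8192 on a default build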
SELECT statement returns in 10 seconds, but INSERT/CREATE TABLE AS with same SELECT takes 7 minutes
I have a strange performance situation that I cannot resolve with my
usual process. I have a SELECT statement that completes in about 12
seconds for the full result (~1100 rows). If I create an empty table
first and then INSERT with the SELECT query, it takes 6.5 minutes.

When I look at the EXPLAIN ANALYZE output, it seems that it's using a
drastically different query plan for the INSERT+SELECT than for the
SELECT by itself.

Here's the explain plan for the SELECT by itself:
https://explain.depesz.com/s/8Qmr

Here's the explain plan for INSERT INTO x SELECT:
https://explain.depesz.com/s/qifT

I am running PostgreSQL 10 (PostgreSQL 10.4 on x86_64-pc-linux-gnu,
compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-18), 64-bit).

shared_buffers = 4GB
effective_cache_size = 4GB
work_mem = 8GB
wal_buffers = -1
max_wal_size = 2GB
wal_level = replica
archiving on
Total RAM on machine: 252GB

This database gets a VACUUM FULL, ANALYZE once a week. Autovacuum is ON
with PG10 default settings. The machine has 12 Intel(R) Xeon(R) CPU
E5-2643 v3 @ 3.40GHz cores, and 15k RPM disks for Postgres. I have
tested write speed to all filesystems and speeds are as expected. The
pg_wal is on a separate disk resource; however, these disks are also 15k
RPM and set up the same way as the Postgres data disks.

The queries are sensitive, so I had to obfuscate them in the explain
plans. I am reluctant to provide full metadata for all the objects
involved, but will if it comes to that. I first want to understand why
the query plan would be so different for a SELECT vs INSERT INTO x
SELECT. I also tried CREATE TABLE x AS SELECT, but it also takes 6+
minutes.

Is there any advice as to the general case of why a SELECT can finish in
10 seconds but CREATE TABLE AS SELECT runs in 7 minutes? Any advice
would be much appreciated.

Thanks,
Arjun Ranade
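For anyone reproducing the comparison, a minimal sketch of capturing
both plans (the table name x and the "..." query body stand in for the
real objects; note that EXPLAIN ANALYZE really executes the statement,
so the INSERT is wrapped in a transaction and rolled back):

    EXPLAIN (ANALYZE, BUFFERS) SELECT ...;                 -- bare SELECT

    BEGIN;
    EXPLAIN (ANALYZE, BUFFERS) INSERT INTO x SELECT ...;   -- same query as a write
    ROLLBACK;                                              -- discard the inserted rows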
Re: SELECT statement returns in 10 seconds, but INSERT/CREATE TABLE AS with same SELECT takes 7 minutes
Arjun Ranade writes:
> I have a strange performance situation that I cannot resolve with my
> usual process.
> I have a SELECT statement that completes in about 12 seconds for the
> full result (~1100 rows).
> If I create an empty table first, and then INSERT with the SELECT
> query, it takes 6.5 minutes.
> When I look at the EXPLAIN ANALYZE output, it seems that it's using a
> drastically different query plan for the INSERT+SELECT than SELECT by
> itself.

The reason for the plan shape difference is probably that the bare
SELECT is allowed to use parallelism while INSERT/SELECT isn't. I'm not
sure to what extent we could relax that without creating semantic
gotchas.

However, your real problem with either query is that the planner's
rowcount estimates are off by several orders of magnitude. If you could
improve that, you'd likely get better plan choices in both cases.

I also notice that this seems to be a 14-way join, which means you're
probably getting an artificially poor plan as a result of
from_collapse_limit and/or join_collapse_limit constraining the
planner's search space. Maybe raising those limits would help, although
I'm not sure how much it'd help if the rowcount estimates aren't
improved.

Since you haven't told us much of anything about the actual query or the
data, it's hard to offer concrete advice beyond that.

regards, tom lane
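A minimal sketch of testing that suggestion in one session (16 is an
arbitrary value chosen to exceed the 14-way join; both settings default
to 8):

    SET from_collapse_limit = 16;
    SET join_collapse_limit = 16;
    EXPLAIN (ANALYZE, BUFFERS) INSERT INTO x SELECT ...;  -- re-plan with the wider search space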
Re: SELECT statement returns in 10 seconds, but INSERT/CREATE TABLE AS with same SELECT takes 7 minutes
Hi Tom,

Thank you for your suggestions. I tried increasing from_collapse_limit
and join_collapse_limit to 16 in a specific session and that
significantly improved my query performance (it takes < 2s now). Now, my
instinct is to increase this globally, but I'm sure there are some
drawbacks to that, so I will need to read more about it.

Your point about parallelism is interesting; I hadn't considered that.

Even after working with Postgres for years, there really is a lot to
learn about query optimization that is new for me. I'd never heard of
these parameters before your email, since almost every performance issue
I've had thus far was resolved by creating an index or by smarter query
re-writing. I'm reading the documentation regarding these specific
parameters, but it's written as a reference page as opposed to an
explanation of query planning and optimization. I wonder if there is a
class or book that explains these details better.

Anyway, thank you so much for pointing me in the right direction.

Best,
Arjun

On Thu, Sep 27, 2018 at 1:21 PM Tom Lane wrote:
> Arjun Ranade writes:
>> I have a strange performance situation that I cannot resolve with my
>> usual process.
>> I have a SELECT statement that completes in about 12 seconds for the
>> full result (~1100 rows).
>> If I create an empty table first, and then INSERT with the SELECT
>> query, it takes 6.5 minutes.
>> When I look at the EXPLAIN ANALYZE output, it seems that it's using a
>> drastically different query plan for the INSERT+SELECT than SELECT by
>> itself.
>
> The reason for the plan shape difference is probably that the bare
> SELECT is allowed to use parallelism while INSERT/SELECT isn't. I'm not
> sure to what extent we could relax that without creating semantic
> gotchas.
>
> However, your real problem with either query is that the planner's
> rowcount estimates are off by several orders of magnitude. If you could
> improve that, you'd likely get better plan choices in both cases.
>
> I also notice that this seems to be a 14-way join, which means you're
> probably getting an artificially poor plan as a result of
> from_collapse_limit and/or join_collapse_limit constraining the
> planner's search space. Maybe raising those limits would help, although
> I'm not sure how much it'd help if the rowcount estimates aren't
> improved.
>
> Since you haven't told us much of anything about the actual query or
> the data, it's hard to offer concrete advice beyond that.
>
> regards, tom lane
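If the session-level change keeps paying off, one middle ground between
per-session SETs and a cluster-wide change is to scope the settings to
the database (or role) that runs this workload; a sketch, with
reporting_db as a placeholder name:

    ALTER DATABASE reporting_db SET from_collapse_limit = 16;
    ALTER DATABASE reporting_db SET join_collapse_limit = 16;

The main cost of raising these limits is planning time, since the
planner searches a larger join-order space.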
Re: SELECT statement returns in 10 seconds, but INSERT/CREATE TABLE AS with same SELECT takes 7 minutes
On Thu, Sep 27, 2018 at 01:08:05PM -0400, Arjun Ranade wrote:
> When I look at the EXPLAIN ANALYZE output, it seems that it's using a
> drastically different query plan for the INSERT+SELECT than SELECT by
> itself.

The fast, SELECT plan is using parallel query, which isn't available for
INSERT+SELECT:

https://www.postgresql.org/docs/current/static/when-can-parallel-query-be-used.html
| Even when it is in general possible for parallel query plans to be
| generated, the planner will not generate them for a given query if any
| of the following are true:
| The query writes any data or locks any database rows.

Using parallel query in this case happens to mitigate the effects of the
bad plan.

I see Tom responded, and you got an improvement by raising the collapse
limits. But I think you could perhaps get a better plan if the rowcount
estimates were fixed. That's more important than probably anything else
- changing settings is only a workaround for bad estimates.

In the slow/INSERT plan, this join (node 7 in the depesz output) is
returning ~55,000x more rows than expected (not 55k more: 55k TIMES
more):

Nested Loop (cost=1,516.620..42,244.240 rows=6 width=84)
            (actual time=311.021..401,503.136 rows=332,902 loops=1)
  Join Filter: (((papa_echo.oscar_bravo)::text = (five_hotel.tango_november)::text)
            AND ((papa_echo.lima_tango)::text = (five_hotel.lima_mike)::text)
            AND ((xray_juliet1.juliet)::text = (five_hotel.papa_victor)::text))
  Rows Removed by Join Filter: 351664882
  Buffers: shared hit=8570619 read=6

First question is whether all those conditions are independent, or
whether one of them implies another, which would confuse the planner.

Justin
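A cheap way to separate the two effects (parallelism vs. plan shape) is
to re-run the bare SELECT with parallelism disabled for the session; if
it is still fast as a serial plan, the INSERT's slowness comes from the
plan choice rather than from losing the workers. A sketch:

    SET max_parallel_workers_per_gather = 0;   -- force a serial plan
    EXPLAIN (ANALYZE, BUFFERS) SELECT ...;
    RESET max_parallel_workers_per_gather;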
Re: SELECT statement returns in 10 seconds, but INSERT/CREATE TABLE AS with same SELECT takes 7 minutes
Yes, that join is concerning (the nested loop quoted below). The
conditions all need to be checked, so they are independent.

The query (with consistent obfuscation) is below:

select distinct a.sale_id,
       a.test_date,
       a.product_id as original_product_id,
       vw2.product_id,
       a.volume as volume,
       b.pair_rank
  from not_sold_locations a
 inner join vw_product vw2
       using (product_group_name, product_class_code,
              product_type_code, sale_end_date)
 inner join product_mapping b
       on a.product_group_name = b.left_product_group_name
      and a.product_node_name = b.left_product_node
      and a.product_type_code = b.left_product
      and vw2.product_node_name = b.right_product_node
      and vw2.product_group_name = b.right_product_group_name
      and vw2.product_type_code = b.right_product
 inner join mapping_ref i
       on vw2.product_group_name || '.' || vw2.product_node_name
          = i.product_node_name
      and vw2.product_class_code = i.product_class_code
      and vw2.product_type_code = i.product_type_code
      and vw2.sale_end_date between i.first_product_date
                                and i.last_product_date;

not_sold_locations (a) has 836 rows.

vw_product (vw2) has 785k rows and is a view that joins 11 tables
together to give a consolidated view of all products, sales locations,
etc.

product_mapping (b) has 2520 rows.

mapping_ref (i) has 178 rows.

On Thu, Sep 27, 2018 at 2:52 PM Justin Pryzby wrote:
> On Thu, Sep 27, 2018 at 01:08:05PM -0400, Arjun Ranade wrote:
>> When I look at the EXPLAIN ANALYZE output, it seems that it's using a
>> drastically different query plan for the INSERT+SELECT than SELECT by
>> itself.
>
> The fast, SELECT plan is using parallel query, which isn't available
> for INSERT+SELECT:
> [...]
> In the slow/INSERT plan, this join is returning ~55,000x more rows than
> expected (not 55k more: 55k TIMES more).
>
> First question is whether all those conditions are independent, or
> whether one of them implies another, which would confuse the planner.
>
> Justin
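One way to narrow down where the estimate collapses is to EXPLAIN the
suspicious joins in isolation and compare estimated vs. actual rows; a
sketch using the names from the query above:

    EXPLAIN ANALYZE
    SELECT count(*)
      FROM vw_product vw2
      JOIN mapping_ref i
        ON vw2.product_group_name || '.' || vw2.product_node_name
           = i.product_node_name;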
Re: SELECT statement returns in 10 seconds, but INSERT/CREATE TABLE AS with same SELECT takes 7 minutes
On Thu, Sep 27, 2018 at 03:37:57PM -0400, Arjun Ranade wrote:
> Yes, that join is concerning (the nested loop quoted below). The
> conditions all need to be checked, so they are independent.

You can play with the join conditions to see which test is getting such
a bad estimate, or if it's a combination of tests (as I suspected)
giving a bad estimate.

There's a good chance this one isn't doing very well:

> vw2.product_group_name || '.' || vw2.product_node_name = i.product_node_name

As a workaround/test, you could maybe add an expression index
ON ((vw2.product_group_name || '.' || vw2.product_node_name))
..and then ANALYZE. Eventually, you'd want to consider splitting
i.product_node_name into separate columns.

Justin
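A sketch of that workaround, assuming the concatenated columns actually
live in a base table under the view (product_dim and the index name are
placeholders):

    CREATE INDEX product_dim_node_path_idx
        ON product_dim ((product_group_name || '.' || product_node_name));
    ANALYZE product_dim;

The index itself is almost beside the point: after ANALYZE, PostgreSQL
keeps statistics on the indexed expression, so the planner gets a real
selectivity estimate for the concatenation instead of a guess.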
Re: SELECT statement returns in 10 seconds, but INSERT/CREATE TABLE AS with same SELECT takes 7 minutes
"As a workaround/test, you could maybe add an expression index ON( (vw2.product_group_name ||'.'|| vw2.product_node_name) )" Unfortunately, vw2 is a view, but I had a similar thought. I'm looking into splitting i.product-node_name into separate columns though, thanks! On Thu, Sep 27, 2018 at 3:33 PM Justin Pryzby wrote: > On Thu, Sep 27, 2018 at 03:37:57PM -0400, Arjun Ranade wrote: > > Yes, that join is concerning (red text below). The conditions all need > to > > be checked so they are independent. > > You can play with the join conditions to see which test is getting such a > bad > estimate, or if it's a combination of tests (as I suspected) giving a bad > estimate. > > There's a good chance this one isn't doing very well: > > >vw2.product_group_name ||'.'|| vw2.product_node_name = > i.product_node_name > > As a workaround/test, you could maybe add an expression index > ON( (vw2.product_group_name ||'.'|| vw2.product_node_name) ) > > ..and then ANALYZE. Eventually, you'd want to consider splitting > i.product_node_name into separate columns. > > Justin >
Re: SELECT statement returns in 10 seconds, but INSERT/CREATE TABLE AS with same SELECT takes 7 minutes
> The reason for the plan shape difference is probably that the bare
> SELECT is allowed to use parallelism while INSERT/SELECT isn't.

In case parallelism is used, shouldn't it be reported in the plan as
something like "Workers Planned: N"?

Vlad
Re: SELECT statement returns in 10 seconds, but INSERT/CREATE TABLE AS with same SELECT takes 7 minutes
Vladimir Ryabtsev writes:
>> The reason for the plan shape difference is probably that the bare
>> SELECT is allowed to use parallelism while INSERT/SELECT isn't.
> In case parallelism is used, shouldn't it be reported in the plan as
> something like "Workers Planned: N"?

It did --- see the Gather node.

regards, tom lane
Re: SELECT statement returns in 10 seconds, but INSERT/CREATE TABLE AS with same SELECT takes 7 minutes
> It did --- see the Gather node.

But "Workers Launched: 1"... In my opinion, such a dramatic difference
cannot be explained by avoiding parallelism alone; the query was simply
stuck in a very inefficient plan (even though almost all source data was
read from cache).

Additionally, I think the author can try CREATE STATISTICS on the bunch
of columns used in the join. The very low row estimate for this join may
come from multiplying the selectivities of the individual columns on the
assumption that they are independent.

Vlad
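For concreteness, a sketch of that suggestion using PG 10's extended
statistics (the statistics name is a placeholder, and the column list is
a guess based on the obfuscated join):

    CREATE STATISTICS product_mapping_dep_stats (dependencies)
        ON right_product_group_name, right_product_node, right_product
        FROM product_mapping;
    ANALYZE product_mapping;

The dependencies statistics let the planner detect that the columns are
correlated instead of multiplying their individual selectivities.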
Re: SELECT statement returns in 10 seconds, but INSERT/CREATE TABLE AS with same SELECT takes 7 minutes
On Thu, Sep 27, 2018 at 04:50:36PM -0700, Vladimir Ryabtsev wrote:
> Additionally, I think the author can try CREATE STATISTICS on the bunch
> of columns used in the join. The very low row estimate for this join
> may come from multiplying the selectivities of the individual columns
> on the assumption that they are independent.

MV statistics don't currently help for joins:
https://www.postgresql.org/message-id/flat/CAKJS1f-6B7KnDFrh6SFhYn-YbHYOXmDDAfd0XC%3DjJKZMCrfQyg%40mail.gmail.com#925e19951fabc9a480b804d661d83be8

Justin
Re: To keep indexes in memory, is large enough effective_cache_size enough?
Hi!

> The double buffering itself does not slow anything down.

That was what I was suspecting a little. Double buffering may not matter
in our case, because the whole server is meant for PostgreSQL only. In
our case, we can e.g. reserve almost "all memory" for PostgreSQL (shared
buffers etc.).

Please correct me if I am wrong.

BR Sam

On Tue, Sep 25, 2018 at 23:55, David Rowley wrote:
> On Tue, 25 Sep 2018 at 18:36, Sam R. wrote:
>> Regarding double buffering: I do not know how much double buffering
>> would slow down operations.
>> It could also be possible to turn off the kernel page cache on our DB
>> server, to avoid double buffering. Although, we may still keep it in
>> use.
>
> I think you've misunderstood double buffering. The double buffering
> itself does not slow anything down. If the buffer is in shared buffers
> already then it does not need to look any further for it. Double
> buffering only becomes an issue when buffers existing 2 times in memory
> causes other useful buffers to appear 0 times.
>
> --
> David Rowley http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
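For what it's worth, when deciding how much memory to hand to
shared_buffers, the stock pg_buffercache extension shows which relations
are actually being kept in the PostgreSQL cache; a sketch, with
'my_index' as a placeholder relation name:

    CREATE EXTENSION IF NOT EXISTS pg_buffercache;

    SELECT c.relname,
           count(*) AS buffers,                       -- 8 KB pages cached
           pg_size_pretty(count(*) * 8192) AS cached
      FROM pg_buffercache b
      JOIN pg_class c
        ON b.relfilenode = pg_relation_filenode(c.oid)
     WHERE b.reldatabase = (SELECT oid FROM pg_database
                             WHERE datname = current_database())
       AND c.relname = 'my_index'
     GROUP BY c.relname;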
