Hello,
We noticed that the following SQL queries are running 3 times slower on the
latest version of PostgreSQL. Here’s the time taken to execute them on
older (v9.5.16) and newer versions (v11.2) of PostgreSQL (in milliseconds):
+---++-+-+---+
| | scale1 | scale10 | scale50 | scale 300 |
+---++-+-+---+
| Query 1 (v9.5.16) | 88 | 937 |4721 | 27241 |
| Query 1 (v11.2) |288 |2822 | 13838 | 85081 |
+---++-+-+---+
| Query 2 (v9.5.16) | 39 | X | X | X|
| Query 2 (v11.2) | 80 | X | X | X|
+---++-+-+---+
For each regression, we share:
1) the associated query,
2) the commit that activated it,
3) our high-level analysis, and
4) query execution plans in old and new versions of PostgreSQL.
All these regressions are observed on the latest version. (v11.2 and
v9.5.16)
We found several other regression related to this commit:
77cd477 (Enable parallel query by default.)
* You can download the queries at:
https://gts3.org/~/jjung/tpcc/case3.tar.gz
* You can reproduce the result by using the same setup that we described
before (As Andrew mentioned before, we increased default work_mem to 128MB):
https://www.postgresql.org/message-id/BN6PR07MB3409922471073F2B619A8CA4EE640%40BN6PR07MB3409.namprd07.prod.outlook.com
## Query 1:
select
ref_0.ol_delivery_d as c1
from
public.order_line as ref_0
where EXISTS (
select
ref_1.i_im_id as c0
from
public.item as ref_1
where ref_0.ol_d_id <= ref_1.i_im_id
)
- Commit: 77cd477 (Enable parallel query by default.)
- Our analysis: We believe that this regression is due to parallel queries
being enabled by default. Surprisingly, we found that even on a larger
TPC-C database (scale factor of 50, roughly 4GB), parallel scan is still
slower than the non-parallel one in the old version, when the query is not
returning any tuples.
- Query Execution Plans
[OLD version]
Nested Loop Semi Join (cost=0.00..90020417940.08 rows=30005835 width=8)
(actual time=0.034..24981.895 rows=90017507 loops=1)
Join Filter: (ref_0.ol_d_id <= ref_1.i_im_id)
-> Seq Scan on order_line ref_0 (cost=0.00..2011503.04 rows=90017504
width=12) (actual time=0.022..7145.811 rows=90017507 loops=1)
-> Materialize (cost=0.00..2771.00 rows=10 width=4) (actual
time=0.000..0.000 rows=1 loops=90017507)
-> Seq Scan on item ref_1 (cost=0.00..2271.00 rows=10 width=4)
(actual time=0.006..0.006 rows=1 loops=1)
Planning time: 0.290 ms
Execution time: 27241.239 ms
[NEW version]
Gather (cost=1000.00..88047487498.82 rows=30005835 width=8) (actual
time=0.265..82355.289 rows=90017507 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Nested Loop Semi Join (cost=0.00..88044485915.32 rows=12502431
width=8) (actual time=0.033..68529.259 rows=30005836 loops=3)
Join Filter: (ref_0.ol_d_id <= ref_1.i_im_id)
-> Parallel Seq Scan on order_line ref_0 (cost=0.00..1486400.93
rows=37507293 width=12) (actual time=0.023..2789.901 rows=30005836 loops=3)
-> Seq Scan on item ref_1 (cost=0.00..2271.00 rows=10 width=4)
(actual time=0.001..0.001 rows=1 loops=90017507)
Planning Time: 0.319 ms
Execution Time: 85081.158 ms
# QUERY 2
select
ref_0.c_id as c0
from
public.customer as ref_0
where EXISTS (
select
ref_0.c_city as c0
from
public.order_line as ref_1
left join public.new_order as ref_2
on (ref_1.ol_supply_w_id = ref_2.no_w_id)
where (ref_1.ol_delivery_d > ref_0.c_since)
)
- Our analysis : There seems to a problem with parallel execution logic. It
is unclear why only one worker is launched, since the default behavior of
PostgreSQL is to launch at least two workers. We found that this query does
not run to completion within a day when we increase the size of the DB.
(e.g., from scale factor 1 to scale factor 10/50/300).
- Commit: 16be2fd (Make dsa_allocate interface more like MemoryContextAlloc)
- Query execution plans:
[Old version]
Nested Loop Semi Join (cost=224.43..910152608046.08 rows=1 width=4)
(actual time=3.681..37.842 rows=3 loops=1)
Join Filter: (ref_1.ol_delivery_d > ref_0.c_since)
-> Seq Scan on customer ref_0 (cost=0.00..2569.00 rows=3 width=12)
(actual time=0.020..22.582 rows=3 loops=1)
-> Materialize (cost=224.43..48521498.97 rows=2406886812 width=8)
(actual time=0.000..0.000 rows=1 loops=3)
-> Hash Left Join (cost=224.43..27085162.91 rows=2406886812
width=8) (actual time=3.652..3.652 rows=1 loops=1)
Hash Cond: (ref_1.ol_supply_w_id = ref_2.no_w_id)
-> Seq Scan on order_line ref_1 (cost=0.00..6711.48
rows=300148 width=12) (actual time=0.016..0.016 rows=1 loops=1)
-> Hash (cost=124.19..124.19 rows=8019