Re: Performance of Query 4 on TPC-DS Benchmark

2024-11-11 Thread Ba Jinsheng

>It is always a challenge for PostgreSQL to estimate such a filter, because
>the planner has no information on the joint distribution of these columns.
>Could you explore this direction by building extended statistics on these
>clauses? It might move the plan in a more optimal direction.

Thanks a lot for your effort in analyzing this issue; we really appreciate 
your suggestions!  Currently, we are focusing on exposing issues that affect 
performance. In the future, we may look into the direction you suggested.


> Have you tried any tools to improve the cardinality yet, like aqo [0]?
Yes, but the query takes nearly an hour per run, so I ran EXPLAIN ANALYZE only 
once with aqo enabled, and the performance seems slightly improved.


  QUERY PLAN

 Limit  (cost=293880.50..293880.50 rows=1 width=132) (actual time=2527921.078..2527921.233 rows=8 loops=1)
   CTE year_total
     ->  Gather  (cost=115049.92..233367.07 rows=384208 width=216) (actual time=1116.139..4005.105 rows=384208 loops=1)
           Workers Planned: 2
           Workers Launched: 2
           ->  Parallel Append  (cost=114049.92..193946.27 rows=160087 width=216) (actual time=2430.791..2510.131 rows=128069 loops=3)
                 ->  HashAggregate  (cost=190763.57..193145.83 rows=190581 width=216) (actual time=3977.521..4070.200 rows=190581 loops=1)
                       Group Key: customer.c_customer_id, customer.c_first_name, customer.c_last_name, customer.c_preferred_cust_flag, customer.c_birth_country, customer.c_login, customer.c_email_address, date_dim.d_year
                       Worker 1:  Batches: 1  Memory Usage: 120857kB
                       ->  Hash Join  (cost=8151.60..103486.35 rows=2685453 width=174) (actual time=64.667..1605.601 rows=2685453 loops=1)
                             Hash Cond: (store_sales.ss_sold_date_sk = date_dim.d_date_sk)
                             ->  Hash Join  (cost=5103.00..93216.88 rows=2750652 width=174) (actual time=48.111..1121.801 rows=2750652 loops=1)
                                   Hash Cond: (store_sales.ss_customer_sk = customer.c_customer_sk)
                                   ->  Seq Scan on store_sales  (cost=0.00..80552.52 rows=2880404 width=30) (actual time=0.068..230.529 rows=2880404 loops=1)
                                   ->  Hash  (cost=3853.00..3853.00 rows=10 width=152) (actual time=47.735..47.735 rows=10 loops=1)
                                         Buckets: 131072  Batches: 1  Memory Usage: 17161kB
                                         ->  Seq Scan on customer  (cost=0.00..3853.00 rows=10 width=152) (actual time=0.012..25.023 rows=10 loops=1)
                             ->  Hash  (cost=2135.49..2135.49 rows=73049 width=8) (actual time=16.242..16.242 rows=73049 loops=1)
                                   Buckets: 131072  Batches: 1  Memory Usage: 3878kB
                                   ->  Seq Scan on date_dim  (cost=0.00..2135.49 rows=73049 width=8) (actual time=0.074..8.744 rows=73049 loops=1)
                 ->  HashAggregate  (cost=114049.92..115762.15 rows=136978 width=216) (actual time=2199.723..2268.851 rows=136978 loops=1)
                       Group Key: customer_1.c_customer_id, customer_1.c_first_name, customer_1.c_last_name, customer_1.c_preferred_cust_flag, customer_1.c_birth_country, customer_1.c_login, customer_1.c_email_address, date_dim_1.d_year
                       Worker 0:  Batches: 1  Memory Usage: 88089kB
                       ->  Hash Join  (cost=8151.60..67544.41 rows=1430939 width=177) (actual time=81.920..911.231 rows=1430939 loops=1)
                             Hash Cond: (catalog_sales.cs_sold_date_sk = date_dim_1.d_date_sk)
                             ->  Hash Join  (cost=5103.00..60729.97 rows=1434519 width=177) (actual time=53.469..638.140 rows=1434519 loops=1)
                                   Hash Cond: (catalog_sales.cs_bill_customer_sk = customer_1.c_customer_sk)
                                   ->  Seq Scan on catalog_sales  (cost=0.00..51842.75 rows=1441548 width=33) (actual time=0.066..134.023 rows=1441548 loops=1)
                                   ->  Hash  (cost=3853.00..3853.00 rows=10 width=152) (actual time=52.937..52.937 rows=10 loops=1)
                                         Buckets: 131072  Batches: 1  Memory Usage: 17161kB
                                         ->  Seq Scan on customer customer_1  (cost=0.00..3853.00 rows=10 width=152) (actual time=0.019..27.549 rows=10 loops=1)
                             ->  Hash  (cost=2135.49..2135.49 rows=73049 width=8) (actual time=27.968..27.968 rows=73049 loops=1)
                                   Buckets: 131072  Batches: 1  Memory Usage: 3878kB
                                   ->  Seq Scan on date_dim date_dim_1  (cost=0.00..2135.49 rows=73049 width=8) (actual time=0.099..14.115 rows=73049 loops=1)
                 ->  HashAggregate  (cost=61268.33..61976.44 rows=56649 width=

Re: Performance of Query 4 on TPC-DS Benchmark

2024-11-11 Thread Ba Jinsheng

>> The default configuration of PostgreSQL produces the error: "ERROR:  could not 
>> resize shared memory segment "/PostgreSQL.3539600020" to 2097152 bytes: No 
>> space left on device"

>No comment on your optimiser experiments for now, but for this error:
>it reminds me of a low/default --shm-size limit from Docker, or other
>similar container stuff?


Aha, you are right. The error disappears when PostgreSQL runs outside the 
Docker container.


Re: Performance of Query 4 on TPC-DS Benchmark

2024-11-11 Thread Andrei Lepikhov

On 11/11/24 02:35, Ba Jinsheng wrote:

Hi all,

Please see this case:


Query 4 on TPC-DS benchmark:

Thank you for the interesting example!
Looking at the explains, I see two sorts:
->  Sort  (cost=794037.94..794037.95 rows=1 width=132)
   (actual time=3024403.310..3024403.313 rows=8 loops=1)
->  Sort  (cost=794033.93..794033.94 rows=1 width=132)
   (actual time=8068.869..8068.872 rows=8 loops=1)

Almost the same cost, but very different execution times. So I think the core 
of the problem is the accuracy of selectivity estimation.

In this specific example I see a lot of composite scan filters:
- ((sale_type = 'w'::text) AND (dyear = 2002))
- ((year_total > '0'::numeric) AND (sale_type = 'w'::text) AND (dyear = 2001))
- ((year_total > '0'::numeric) AND (sale_type = 's'::text) AND (dyear = 2001))


It is always a challenge for PostgreSQL to estimate such a filter, because 
the planner has no information on the joint distribution of these columns.
Could you explore this direction by building extended statistics on these 
clauses? It might move the plan in a more optimal direction.
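
As a generic illustration only (the clauses above reference CTE output columns, 
so the statistics would have to be defined on the corresponding columns of the 
underlying base tables; the table and column names below are placeholders):

  -- multivariate statistics on a pair of correlated columns,
  -- followed by ANALYZE so the planner can actually use them
  CREATE STATISTICS some_table_corr_stats (dependencies, mcv)
      ON col_a, col_b FROM some_table;
  ANALYZE some_table;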


--
regards, Andrei Lepikhov