Query optimization advice for beginners

2020-01-27 Thread Kemal Ortanca
Hello mail group members,

I started a new job as PostgreSQL DBA. This is my first mail, I hope the mail I 
sent meets the rules.

There is a query that runs slowly when I look at the logs of the database. When 
I check the resources of the system, there is no problem in the resources, but 
this query running slowly.  There is no "Seq Scan" in the queries, so the 
tables are already indexed. But I did not fully understand if the indexes were 
made correctly. When I analyze the query result on explain.depesz, it seems 
that the query is taking too long.

How should I fix the query below? How should I read the output of 
explain.depesz?

Thank you in advance for your help.


select pro.id as pro_id
   , pro.code
   , coalesce(s.is_pick, false)
   , coalesce(sum(sb.quantity), 0) as pick_quantity
from mainproduct_productmetaproduction pro, order_basketitemdetail bid
left join shelf_shelvedproductbatch sb on sb.basketitem_detail_id = bid.id
left join shelf_shelvedproducts sp on sp.id = sb.shelved_product_id
left join shelf_shelf s on s.id = sp.shelf_id
where pro.id = bid.production_id
and (
  select coalesce(sum(bid.quantity), 0)
  from order_basketitem bi
   , order_basketitemdetail bid
   , order_order o
  where o.type in (2,7,9) and o.id = bi.order_id
  and o.is_cancelled = false
  and bi.is_cancelled = false
  and o.is_closed = false
  and o.is_picked = false
  and o.is_invoiced = false
  and o.is_sent = false
  and bi.id = bid.basketitem_id
  and bid.quantity > (
select 
coalesce(sum(picked_quantity),0)
from 
order_basketitembatch bib
where 
bib.detail_id=bid.id
 )
  and bid.code = pro.code
  ) > 0
group by 1,2,3 --,bid.pallet_item_quantity
having coalesce(s.is_pick, false)
and round((coalesce(sum(sb.quantity), 0) / 
GREATEST(MAX(bid.pallet_item_quantity), 1)::float)::numeric, 2) <= 0.15

https://explain.depesz.com/s/G4vq

Yours truly,
Kemal Ortanca


Re: Query optimization advice for beginners

2020-01-27 Thread Andreas Kretschmer




Am 27.01.20 um 14:15 schrieb Kemal Ortanca:


https://explain.depesz.com/s/G4vq




the estimates and the real values are very different, seems like 
problems with autoanalyze.


which version?



Andreas

--
2ndQuadrant - The PostgreSQL Support Company.
www.2ndQuadrant.com





Re: Query optimization advice for beginners

2020-01-27 Thread Kemal Ortanca
Firstly, thank you for coming back.

PostgreSQL version = 11.5

Is there a resource or postgresql configuration you want me to check in 
addition?



From: Andreas Kretschmer 
Sent: Monday, January 27, 2020 3:57 PM
To: [email protected] 

Subject: Re: Query optimization advice for beginners



Am 27.01.20 um 14:15 schrieb Kemal Ortanca:
>
> https://explain.depesz.com/s/G4vq
>
>

the estimates and the real values are very different, seems like
problems with autoanalyze.

which version?



Andreas

--
2ndQuadrant - The PostgreSQL Support Company.
www.2ndQuadrant.com





Re: Query optimization advice for beginners

2020-01-27 Thread Michael Lewis
You've got two references to order_basketitemdetail both aliased to bid and
ALSO a table called order_basketitembatch aliased to bib. I assume that
confuses the planner, but even if it doesn't it certainly confuses any new
developers trying to understand the query's intention.

The biggest thing that leaps out at me on the explain plan is the 822
thousand loops on index order_basketitembatch_detail_id_9268ccff. That
seems to be the subquery in the where clause of the subquery in the main
where clause. I never get great results when I nest sub-queries multiple
levels. Without knowing your data, we can only make guesses about
restructuring the query so it performs better.

select bi.id AS basketitem_id --coalesce(sum(bid.quantity), 0)
  from order_basketitem bi
   --, order_basketitemdetail bid
   , order_order o
  where o.type in (2,7,9) and o.id = bi.order_id
  and o.is_cancelled = false
  and bi.is_cancelled = false
  and o.is_closed = false
  and o.is_picked = false
  and o.is_invoiced = false
  and o.is_sent = false
  --and bi.id = bid.basketitem_id

For a query like the above, how restrictive is it? That is, of ALL the
records in order_basketitem table, how many are returned by the above
condition? I would think that the number of orders that have been picked or
invoiced or sent or closed or cancelled would be LARGE and so this query
may eliminate most of the orders from being considered. Not to mention the
order type id restriction.

If I found that the above query resulted in 1% of the table being returned
perhaps, there are a number of ways to influence the planner to do this
work first such as-

1) put this in a sub-query as the FROM and include OFFSET 0 hack to prevent
in-lining
2) put in a CTE using the WITH keyword (note- need to use MATERIALIZED
option once on PG12 since default behavior changes)
3) if the number of records returned is large (10 thousand maybe?) and the
optimizer is making bad choices on the rest of the query that uses this
result set, put this query into a temp table, analyze it, and then use it.

>