bad plan using nested loops

2018-02-01 Thread Johan Fredriksson
Hello!

I brought this issue up about two years ago but never got any real
explanation or solution. The problem is that PostgreSQL produces really
bad plans using nested loops. With "enable_nestloop = 0" the same query
runs about 20 times faster.

The suggested solution I got back then was to upgrade to the latest
version of PostgreSQL (then 9.5). It did not help. The solution we
finally applied was a horribly ugly patch to the Perl module
SearchBuilder that recognized queries that would perform badly and put
them inside transaction blocks with "SET LOCAL enable_nestloop = 0".
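
Roughly, the pattern that patch wrapped around an affected query was
(a sketch only; the actual queries come from SearchBuilder):

BEGIN;
SET LOCAL enable_nestloop = 0;  -- reverts automatically at COMMIT/ROLLBACK
-- ... the badly-planned query runs here ...
COMMIT;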

Last week I upgraded PostgreSQL for this application (Request Tracker)
to version 10.1, and just for fun I decided to try removing the patch
to see if the problem still persisted. For two cases it did not; the
planner handled them just fine. For one case, however, the same problem
still remains.

Bad plan: https://explain.depesz.com/s/avtZ
Good plan: https://explain.depesz.com/s/SJSt

Any suggestions on how to make the planner make better decisions for
this query?


/ Eskil




Query optimiser is not using 'not null' constraint when 'order by nulls last' clause is used

2018-02-01 Thread Nandakumar M
Hi,

I am using Postgres version 9.4.4 on a Mac machine. I have two queries
that differ only in the ORDER BY clause. One of them has 'nulls last'
and the other does not. The performance difference between the two is
considerable.

The slower of the two queries is

SELECT wos.notificationstatus,
   wos.unrepliedcount,
   wos.shownotestotech,
   wos.ownerid,
   wos.isfcr,
   aau.user_id,
   wo.workorderid AS "WOID",
   wo.is_catalog_template AS "TemplateType",
   wo.title   AS "Title",
   wo.is_catalog_template,
   aau.first_name  AS "Requester",
   cd.categoryname AS "Category",
   ti.first_name   AS "Owner",
   wo.duebytime AS "DueBy",
   wo.fr_duetime,
   wo.completedtime AS "CompletedTime",
   wo.respondedtime AS "RespondedTime",
   wo.resolvedtime  AS "ResolvedTime",
   qd.queuename AS "Group",
   std.statusname   AS "Status",
   wo.createdtime   AS "CreatedDate",
   wos.isread,
   wos.hasattachment,
   wos.appr_statusid,
   wos.priorityid,
   wo.templateid AS "TemplateId",
   pd.priorityid,
   pd.priorityname  AS "Priority",
   pd.prioritycolor AS "PriorityColor",
   wos.isoverdue,
   wos.is_fr_overdue,
   wos.linkedworkorderid,
   wos.editing_status,
   wos.editorid,
   wos.linkedworkorderid,
   wo.isparent,
   sduser.isvipuser,
   sduser_onbehalfof.isvipuser AS "ONBEHALFOFVIP",
   wo.isparent,
   wos.statusid,
   sdorganization.name AS "Site",
   wo.workorderid  AS "RequestID"
FROM   workorder wo
left join  workorder_fields wof
ON wo.workorderid=wof.workorderid
left join  servicecatalog_fields scf
ON wo.workorderid=scf.workorderid
left join  wotoprojects wtp
ON wo.workorderid=wtp.workorderid
left join  sitedefinition
ON wo.siteid=sitedefinition.siteid
left join  sdorganization
ON sitedefinition.siteid=sdorganization.org_id
inner join workorderstates wos
ON wo.workorderid=wos.workorderid
left join  categorydefinition cd
ON wos.categoryid=cd.categoryid
left join  aaauser ti
ON wos.ownerid=ti.user_id
left join  aaauser aau
ON wo.requesterid=aau.user_id
left join  prioritydefinition pd
ON wos.priorityid=pd.priorityid
left join  statusdefinition std
ON wos.statusid=std.statusid
left join  workorder_queue wo_queue
ON wo.workorderid=wo_queue.workorderid
left join  queuedefinition qd
ON wo_queue.queueid=qd.queueid
left join  departmentdefinition dpt
ON wo.deptid=dpt.deptid
left join  leveldefinition lvd
ON wos.levelid=lvd.levelid
left join  modedefinition mdd
ON wo.modeid=mdd.modeid
left join  urgencydefinition urgdef
ON wos.urgencyid=urgdef.urgencyid
left join  impactdefinition impdef
ON wos.impactid=impdef.impactid
left join  requesttypedefinition rtdef
ON wos.requesttypeid=rtdef.requesttypeid
left join  subcategorydefinition scd
ON wos.subcategoryid=scd.subcategoryid
left join  itemdefinition icd
ON wos.itemid=icd.itemid
left join  servicedefinition serdef
ON wo.serviceid=serdef.serviceid
left join  aaauser cbau
ON wo.createdbyid=cbau.user_id
left join  aaauser oboaau
ON wo.oboid=oboaau.user_id
left join  sduser
ON wo.requesterid=sduser.userid
left join  sduser sduser_onbehalfof
ON wo.oboid=sduser_onbehalfof.userid
left join  workorder_fields
ON wo.workorderid=workorder_fields.workorderid
WHERE  (wos.statusid = 1)
  AND  (wo.isparent = TRUE)
ORDER BY 7 DESC NULLS LAST
LIMIT 25



On removing 'nulls last' from the ORDER BY clause the query becomes very
fast. I have attached the query plans for both queries.

From the plans it looks like the second query is able to efficiently use
the workorder_pk index (the node 'Index Scan Backward using workorder_pk
on workorder' returns 25 rows), whereas the first query is not able to
use the index efficiently (more than 300k rows are returned from the
same node).

The column workorderid is a PK column. The query optimizer should
ideally know that there are no nulls in this column, and that in effect
there is no difference between the two queries.
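
To illustrate with a minimal standalone sketch (a hypothetical table,
not the real schema): on a NOT NULL column both orderings return the
same rows, but on 9.4 only plain DESC matches the default btree index
scanned backward, while DESC NULLS LAST forces a sort.

CREATE TABLE t (id bigint PRIMARY KEY);
INSERT INTO t SELECT generate_series(1, 100000);
ANALYZE t;

EXPLAIN SELECT * FROM t ORDER BY id DESC LIMIT 25;             -- backward index scan
EXPLAIN SELECT * FROM t ORDER BY id DESC NULLS LAST LIMIT 25;  -- sort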

I tried the same in Postgres 10, and the slower query performs much
better due to parallel sequential scans, but it is still less efficient
than the query without 'nulls last'.
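
A possible workaround (which I have not tried) would be an index whose
ordering matches the ORDER BY clause exactly, since the default
ascending index scanned backward yields DESC NULLS FIRST rather than
DESC NULLS LAST. The index name below is made up:

-- hypothetical index matching the query's sort order
CREATE INDEX workorder_id_desc_nulls_last
    ON workorder (workorderid DESC NULLS LAST);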

I thought it would be best to raise this with the Postgres team.

Regards,
Nanda
[Attachment: pg_9_4_Fast — QUERY PLAN]

Re: bad plan using nested loops

2018-02-01 Thread Tom Lane
Johan Fredriksson  writes:
> Bad plan: https://explain.depesz.com/s/avtZ
> Good plan: https://explain.depesz.com/s/SJSt
> Any suggestions on how to make the planner make better decisions for
> this query?

Core of the problem looks to be the misestimation here:

Index Only Scan using shredder_cgm1 on public.cachedgroupmembers cachedgroupmembers_4
    (cost=0.43..2.33 rows=79 width=8) (actual time=0.020..0.903 rows=1492 loops=804)
  Output: cachedgroupmembers_4.memberid, cachedgroupmembers_4.groupid, cachedgroupmembers_4.disabled
  Index Cond: ((cachedgroupmembers_4.memberid = principals_1.id) AND (cachedgroupmembers_4.disabled = 0))
  Heap Fetches: 5018

Probably, memberid and disabled are correlated but the planner doesn't
know that, so it thinks the index condition is way more selective than it
actually is.  In PG 10, you could very possibly fix that by installing
extended statistics on that pair of columns.  See

https://www.postgresql.org/docs/current/static/planner-stats.html#PLANNER-STATS-EXTENDED
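
A sketch of what that might look like here, using the column names from
the plan above (the statistics object name is made up):

CREATE STATISTICS cgm_memberid_disabled (dependencies)
    ON memberid, disabled FROM cachedgroupmembers;
ANALYZE cachedgroupmembers;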

regards, tom lane



Re: effective_io_concurrency on EBS/gp2

2018-02-01 Thread Claudio Freire
On Wed, Jan 31, 2018 at 11:21 PM, hzzhangjiazhi wrote:
> Hi,
>
> I think this parameter will be useful when the storage uses RAID
> striping; otherwise, turning this parameter up is meaningless when
> there is only one device.

Not at all. Especially on EBS, where keeping a relatively full queue
is necessary to get max throughput out of the drive.

The problem is, if you're scanning a highly correlated index, the
mechanism is counterproductive. I had worked on some POC patches for
correcting that; I guess I could work something out, but it's
low-priority for me, especially since it's actually a kernel "bug" (or
shortcoming) that could be fixed in the kernel rather than worked
around by Postgres.
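
For anyone wanting to experiment, the setting can be changed per session
or per tablespace; the value below is purely illustrative:

SET effective_io_concurrency = 32;                                -- session level
ALTER TABLESPACE pg_default SET (effective_io_concurrency = 32);  -- per tablespace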



SV: bad plan using nested loops

2018-02-01 Thread Johan Fredriksson

> Johan Fredriksson  writes:
> > Bad plan: https://explain.depesz.com/s/avtZ
> > Good plan: https://explain.depesz.com/s/SJSt
> > Any suggestions on how to make the planner make better decisions for
> > this query?
> 
> Core of the problem looks to be the misestimation here:
> 
>Index Only Scan using shredder_cgm1 on public.cachedgroupmembers 
> cachedgroupmembers_4
> (cost=0.43..2.33 rows=79 width=8) (actual time=0.020..0.903 rows=1492 
> loops=804)
>  Output: cachedgroupmembers_4.memberid, cachedgroupmembers_4.groupid,
> cachedgroupmembers_4.disabled
>  Index Cond: ((cachedgroupmembers_4.memberid = principals_1.id) AND
> (cachedgroupmembers_4.disabled = 0))
>  Heap Fetches: 5018
>
> Probably, memberid and disabled are correlated but the planner doesn't
> know that, so it thinks the index condition is way more selective than it
> actually is.  In PG 10, you could very possibly fix that by installing
> extended statistics on that pair of columns.  See
> 
> https://www.postgresql.org/docs/current/static/planner-stats.html#PLANNER-STATS-EXTENDED

I'm not sure what you mean by correlated, but there are only a handful
(164 when I checked) disabled groupmembers out of a total of 7.5 million.
I'll give CREATE STATISTICS on those columns a shot and see if it gets
any better.

/ Eskil